The previous articles introduced knowledge related to audio and video in Android development, such as MediaCodec, MediaMuxer, AudioRecord, etc. These are essential for Android audio and video development. The links to the related articles are as follows:
- Camera2, MediaCodec Recording MP4
- Android Native Encoding and Decoding Interface MediaCodec Detailed Explanation
- AudioRecord Audio Data Collection and Synthesis - jzman
In Android, the commonly used interfaces for playing audio are MediaPlayer, AudioTrack, and SoundPool; for audio rendering, AudioTrack and OpenSL ES are used most often. This article introduces AudioTrack, covering the following topics:
- Introduction to AudioTrack
- Creating an AudioTrack
- Writing Audio Data to AudioTrack
- Lifecycle of AudioTrack
- Using AudioTrack
Introduction to AudioTrack#
AudioTrack is used to play raw PCM audio data. It has two playback modes:
- MODE_STATIC: the audio data is written to the audio buffer at once. It suits short sounds that need minimal latency and little memory, such as game sound effects, ringtones, and system prompts, and this mode has the least overhead.
- MODE_STREAM: audio data is written continuously. It suits scenarios where audio data keeps arriving, mainly when the audio is long or its characteristics (high sample rate, higher bit depth, etc.) prevent it from being written into memory all at once. This is the usual mode for playing raw PCM audio data.
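As a sketch of the MODE_STATIC path, assuming a short 16-bit mono PCM clip already loaded into a ByteArray (the function name and parameter values here are illustrative, not from the article):

```kotlin
// Sketch: playing a short PCM clip with MODE_STATIC.
// soundBytes is assumed to hold the complete 16-bit mono PCM clip.
fun playStaticClip(soundBytes: ByteArray) {
    val attributes = AudioAttributes.Builder()
        .setUsage(AudioAttributes.USAGE_GAME)
        .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
        .build()
    val format = AudioFormat.Builder()
        .setSampleRate(44100)
        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
        .build()
    // In MODE_STATIC the buffer must be large enough to hold the whole clip.
    val track = AudioTrack(
        attributes, format, soundBytes.size,
        AudioTrack.MODE_STATIC, AudioManager.AUDIO_SESSION_ID_GENERATE
    )
    // Write the entire clip once, then play; the same buffer can be
    // replayed later via reloadStaticData() without writing again.
    track.write(soundBytes, 0, soundBytes.size)
    track.play()
}
```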
Compared to MediaPlayer, which can play different types and formats of sound files and creates corresponding audio decoders at the lower level, AudioTrack only accepts PCM raw audio data. MediaPlayer still creates an AudioTrack at the lower level and passes the decoded PCM stream to AudioTrack. AudioTrack then passes it to AudioFlinger for mixing before it is played by the hardware.
Creating an AudioTrack#
AudioTrack is created using the following method:
// Starting from Android 5.0
AudioTrack(
attributes: AudioAttributes!,
format: AudioFormat!,
bufferSizeInBytes: Int,
mode: Int,
sessionId: Int)
The meanings of the parameters corresponding to the above constructor are as follows:
- attributes: the set of attributes describing the audio stream. Starting from Android 5.0, AudioAttributes replaces the stream-type setting; it can convey more information than a stream type and is commonly used to declare the usage and content type of the audio.
- format: the audio format accepted by the AudioTrack. For linear PCM it reflects the sample size (8, 16, or 32 bits) and representation (integer or floating point). Audio formats are defined in AudioFormat; among the commonly used ones, only AudioFormat.ENCODING_PCM_16BIT is guaranteed to work properly on all devices, whereas AudioFormat.ENCODING_PCM_8BIT, for example, may not.
- bufferSizeInBytes: the size of the audio data buffer in bytes, generally a nonzero multiple of the audio frame size. In MODE_STATIC this is the size of the audio clip being played; in MODE_STREAM it must be no smaller than the minimum buffer size returned by getMinBufferSize.
- mode: the playback mode, either MODE_STATIC or MODE_STREAM. MODE_STATIC writes the audio resource to the audio buffer at once and suits low-latency, low-memory scenarios such as ringtones and system prompts. MODE_STREAM continuously feeds data through the write method; it adds some latency compared to MODE_STATIC but can keep receiving audio data indefinitely.
- sessionId: the audio session ID. AudioManager.AUDIO_SESSION_ID_GENERATE lets the underlying audio framework generate the session ID.
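Putting the buffer-size parameter into practice, a minimal sketch of querying the stream-mode minimum buffer (the 44.1 kHz mono 16-bit configuration is illustrative):

```kotlin
// Query the minimum stream-mode buffer for 44.1 kHz mono 16-bit PCM.
val minBufferSize = AudioTrack.getMinBufferSize(
    44100,
    AudioFormat.CHANNEL_OUT_MONO,
    AudioFormat.ENCODING_PCM_16BIT
)
// getMinBufferSize returns ERROR or ERROR_BAD_VALUE on failure,
// so validate before passing it to the AudioTrack constructor.
check(minBufferSize > 0) { "Unsupported audio configuration" }
```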
Writing Audio Data to AudioTrack#
Whether in stream mode (MODE_STREAM) or static buffer mode (MODE_STATIC), audio data must be written to the audio track through the write method for playback. The main write overloads are as follows:
// The format specified in the AudioTrack constructor should be AudioFormat#ENCODING_PCM_8BIT
open fun write(audioData: ByteArray, offsetInBytes: Int, sizeInBytes: Int): Int
// The format specified in the AudioTrack constructor should be AudioFormat#ENCODING_PCM_16BIT
open fun write(audioData: ShortArray, offsetInShorts: Int, sizeInShorts: Int): Int
// The format specified in the AudioTrack constructor should be AudioFormat#ENCODING_PCM_FLOAT
open fun write(audioData: FloatArray, offsetInFloats: Int, sizeInFloats: Int, writeMode: Int): Int
A return value of zero or greater indicates the number of units successfully written. The common error codes returned when writing audio data are as follows:
- ERROR_INVALID_OPERATION: the AudioTrack is not properly initialized.
- ERROR_BAD_VALUE: the parameters are invalid.
- ERROR_DEAD_OBJECT: the track object has died and must be recreated. If some audio data was already transferred in this call, the error code is not returned immediately; it will be returned by the next write call.
This is similar to the read function in AudioRecord. For specific details, please refer to the official documentation.
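The error codes above can be handled by checking the return value of each write call. A sketch of that check (TAG and the surrounding fields are assumed from the example code later in the article):

```kotlin
// Sketch: handling write() results in stream mode.
val written = audioTrack.write(bytes, 0, len)
when {
    written >= 0 ->
        Unit // `written` bytes were queued for playback
    written == AudioTrack.ERROR_INVALID_OPERATION ->
        Log.e(TAG, "AudioTrack not initialized")
    written == AudioTrack.ERROR_BAD_VALUE ->
        Log.e(TAG, "Invalid parameters passed to write()")
    written == AudioTrack.ERROR_DEAD_OBJECT ->
        // The track died; release it and create a new AudioTrack.
        Log.e(TAG, "AudioTrack object died")
}
```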
Lifecycle of AudioTrack#
The lifecycle of AudioTrack mainly involves the states STATE_UNINITIALIZED, STATE_INITIALIZED, and STATE_NO_STATIC_DATA. A track created in MODE_STREAM starts in STATE_INITIALIZED, while one created in MODE_STATIC starts in STATE_NO_STATIC_DATA and moves to STATE_INITIALIZED once audio data has been written to it. The play state (stopped, paused, playing) is tracked separately and matters less here.
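The initialization state can be checked before starting playback; a minimal sketch (TAG is an assumed logging tag, not from the article):

```kotlin
// Sketch: verifying initialization state before playback.
// A stream-mode track reports STATE_INITIALIZED after construction;
// a static-mode track reports STATE_NO_STATIC_DATA until data is written.
if (audioTrack.state == AudioTrack.STATE_UNINITIALIZED) {
    Log.e(TAG, "AudioTrack failed to initialize")
    return
}
audioTrack.play()
```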
Using AudioTrack#
The basic usage of AudioTrack is to read data from a PCM file and write it to the AudioTrack for playback. The key code is as follows:
// Initialize AudioTrack
private fun initAudioTrack() {
bufferSize = AudioTrack
.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT)
attributes = AudioAttributes.Builder()
.setUsage(AudioAttributes.USAGE_MEDIA) // Set the usage of the audio
.setContentType(AudioAttributes.CONTENT_TYPE_MUSIC) // Set the content type of the audio
.build()
audioFormat = AudioFormat.Builder()
.setSampleRate(SAMPLE_RATE)
.setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
.setEncoding(AudioFormat.ENCODING_PCM_16BIT)
.build()
audioTrack = AudioTrack(
attributes, audioFormat, bufferSize,
AudioTrack.MODE_STREAM, AudioManager.AUDIO_SESSION_ID_GENERATE
)
}
// Write audio data to AudioTrack
private fun writeAudioData() {
    scope.launch(Dispatchers.IO) {
        val pcmFile = File(pcmFilePath)
        // use {} closes the stream even if reading or writing fails
        FileInputStream(pcmFile).use { ins ->
            val bytes = ByteArray(bufferSize)
            var len: Int
            while (ins.read(bytes).also { len = it } > 0) {
                audioTrack.write(bytes, 0, len)
            }
        }
        audioTrack.stop()
    }
}
// Start playback
private fun start(){
audioTrack.play()
writeAudioData()
}
The basic usage of AudioTrack is as described above. If you are interested in the code related to playing audio with AudioTrack, please leave a comment to obtain it.