The previous articles introduced knowledge related to audio and video in Android development, such as MediaCodec, MediaMuxer, AudioRecord, etc. These are essential for Android audio and video development. The links to the related articles are as follows:
- Camera2, MediaCodec Recording MP4
- Android Native Encoding and Decoding Interface MediaCodec Detailed Explanation
- AudioRecord Audio Data Collection and Synthesis - jzman
In Android, the commonly used interfaces for playing audio are MediaPlayer, AudioTrack, and SoundPool; for audio rendering, AudioTrack and OpenSL ES are used most often. This article introduces AudioTrack, covering the following topics:
- Introduction to AudioTrack
- Creating an AudioTrack
- Writing Audio Data to AudioTrack
- Lifecycle of AudioTrack
- Using AudioTrack
Introduction to AudioTrack#
AudioTrack is used to play raw PCM audio data. It has two playback modes:
- MODE_STATIC: the audio data is written to the audio buffer at once. It suits short sounds that need minimal latency and little memory, such as game sound effects, ringtones, and system prompts, and this mode has the least overhead.
- MODE_STREAM: audio data is written continuously. It suits scenarios where audio data keeps arriving, mainly when the audio is long or its characteristics (high sample rate, higher bit depth, etc.) prevent it from being written into memory all at once. This is the usual mode for playing raw PCM audio data.
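As a sketch of the MODE_STATIC path, assuming a short 16-bit mono PCM clip already loaded into a ByteArray (the function name and parameter values here are illustrative, not from the article):

```kotlin
// Sketch: playing a short PCM clip with MODE_STATIC.
// soundBytes is assumed to hold the complete 16-bit mono PCM clip.
fun playStaticClip(soundBytes: ByteArray) {
    val attributes = AudioAttributes.Builder()
        .setUsage(AudioAttributes.USAGE_GAME)
        .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
        .build()
    val format = AudioFormat.Builder()
        .setSampleRate(44100)
        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
        .build()
    // In MODE_STATIC the buffer must be large enough to hold the whole clip.
    val track = AudioTrack(
        attributes, format, soundBytes.size,
        AudioTrack.MODE_STATIC, AudioManager.AUDIO_SESSION_ID_GENERATE
    )
    // Write the entire clip once, then play; the same buffer can be
    // replayed later via reloadStaticData() without writing again.
    track.write(soundBytes, 0, soundBytes.size)
    track.play()
}
```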
Compared to MediaPlayer, which can play different types and formats of sound files and creates corresponding audio decoders at the lower level, AudioTrack only accepts PCM raw audio data. MediaPlayer still creates an AudioTrack at the lower level and passes the decoded PCM stream to AudioTrack. AudioTrack then passes it to AudioFlinger for mixing before it is played by the hardware.
Creating an AudioTrack#
AudioTrack is created using the following method:
// Starting from Android 5.0
AudioTrack(
attributes: AudioAttributes!,
format: AudioFormat!,
bufferSizeInBytes: Int,
mode: Int,
sessionId: Int)
The meanings of the parameters corresponding to the above constructor are as follows:
- attributes: the set of attributes describing the audio stream. Starting from Android 5.0, AudioAttributes replaces the stream-type setting; it can convey more information than a stream type and is commonly used to declare the usage and content type of the audio.
- format: the audio format accepted by the AudioTrack. For linear PCM it reflects the sample size (8, 16, or 32 bits) and representation (integer or floating point). Audio formats are defined in AudioFormat; among the commonly used ones, only AudioFormat.ENCODING_PCM_16BIT is guaranteed to work properly on all devices, whereas AudioFormat.ENCODING_PCM_8BIT, for example, may not.
- bufferSizeInBytes: the size of the audio data buffer in bytes, generally a nonzero multiple of the audio frame size. In MODE_STATIC this is the size of the audio clip being played; in MODE_STREAM it must be no smaller than the minimum buffer size returned by getMinBufferSize.
- mode: the playback mode, either MODE_STATIC or MODE_STREAM. MODE_STATIC writes the audio resource to the audio buffer at once and suits low-latency, low-memory scenarios such as ringtones and system prompts. MODE_STREAM continuously feeds data through the write method; it adds some latency compared to MODE_STATIC but can keep receiving audio data indefinitely.
- sessionId: the audio session ID. AudioManager.AUDIO_SESSION_ID_GENERATE lets the underlying audio framework generate the session ID.
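Putting the buffer-size parameter into practice, a minimal sketch of querying the stream-mode minimum buffer (the 44.1 kHz mono 16-bit configuration is illustrative):

```kotlin
// Query the minimum stream-mode buffer for 44.1 kHz mono 16-bit PCM.
val minBufferSize = AudioTrack.getMinBufferSize(
    44100,
    AudioFormat.CHANNEL_OUT_MONO,
    AudioFormat.ENCODING_PCM_16BIT
)
// getMinBufferSize returns ERROR or ERROR_BAD_VALUE on failure,
// so validate before passing it to the AudioTrack constructor.
check(minBufferSize > 0) { "Unsupported audio configuration" }
```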
Writing Audio Data to AudioTrack#
Whether in stream mode (MODE_STREAM) or static buffer mode (MODE_STATIC), audio data must be written to the audio track through the write method for playback. The main write overloads are as follows:
// The format specified in the AudioTrack constructor should be AudioFormat#ENCODING_PCM_8BIT
open fun write(audioData: ByteArray, offsetInBytes: Int, sizeInBytes: Int): Int
// The format specified in the AudioTrack constructor should be AudioFormat#ENCODING_PCM_16BIT
open fun write(audioData: ShortArray, offsetInShorts: Int, sizeInShorts: Int): Int
// The format specified in the AudioTrack constructor should be AudioFormat#ENCODING_PCM_FLOAT
open fun write(audioData: FloatArray, offsetInFloats: Int, sizeInFloats: Int, writeMode: Int): Int
A return value of zero or greater indicates the number of units successfully written. The common error codes returned when writing audio data are as follows:
- ERROR_INVALID_OPERATION: the AudioTrack is not properly initialized.
- ERROR_BAD_VALUE: the parameters are invalid.
- ERROR_DEAD_OBJECT: the track object has died and must be recreated. If some audio data was already transferred in this call, the error code is not returned immediately; it will be returned by the next write call.
This is similar to the read function in AudioRecord. For specific details, please refer to the official documentation.
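The error codes above can be handled by checking the return value of each write call. A sketch of that check (TAG and the surrounding fields are assumed from the example code later in the article):

```kotlin
// Sketch: handling write() results in stream mode.
val written = audioTrack.write(bytes, 0, len)
when {
    written >= 0 ->
        Unit // `written` bytes were queued for playback
    written == AudioTrack.ERROR_INVALID_OPERATION ->
        Log.e(TAG, "AudioTrack not initialized")
    written == AudioTrack.ERROR_BAD_VALUE ->
        Log.e(TAG, "Invalid parameters passed to write()")
    written == AudioTrack.ERROR_DEAD_OBJECT ->
        // The track died; release it and create a new AudioTrack.
        Log.e(TAG, "AudioTrack object died")
}
```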
Lifecycle of AudioTrack#
The lifecycle of AudioTrack mainly involves the states STATE_UNINITIALIZED, STATE_INITIALIZED, and STATE_NO_STATIC_DATA. A track created in MODE_STREAM starts in STATE_INITIALIZED, while one created in MODE_STATIC starts in STATE_NO_STATIC_DATA and moves to STATE_INITIALIZED once audio data has been written to it. The play state (stopped, paused, playing) is tracked separately and matters less here.
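The initialization state can be checked before starting playback; a minimal sketch (TAG is an assumed logging tag, not from the article):

```kotlin
// Sketch: verifying initialization state before playback.
// A stream-mode track reports STATE_INITIALIZED after construction;
// a static-mode track reports STATE_NO_STATIC_DATA until data is written.
if (audioTrack.state == AudioTrack.STATE_UNINITIALIZED) {
    Log.e(TAG, "AudioTrack failed to initialize")
    return
}
audioTrack.play()
```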
Using AudioTrack#
The basic usage of AudioTrack is to read data from a PCM file and write it to the AudioTrack for playback. The key code is as follows:
// Initialize AudioTrack
private fun initAudioTrack() {
bufferSize = AudioTrack
.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT)
attributes = AudioAttributes.Builder()
.setUsage(AudioAttributes.USAGE_MEDIA) // Set the usage of the audio
.setContentType(AudioAttributes.CONTENT_TYPE_MUSIC) // Set the content type of the audio
.build()
audioFormat = AudioFormat.Builder()
.setSampleRate(SAMPLE_RATE)
.setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
.setEncoding(AudioFormat.ENCODING_PCM_16BIT)
.build()
audioTrack = AudioTrack(
attributes, audioFormat, bufferSize,
AudioTrack.MODE_STREAM, AudioManager.AUDIO_SESSION_ID_GENERATE
)
}
// Write audio data to AudioTrack
private fun writeAudioData() {
    scope.launch(Dispatchers.IO) {
        val pcmFile = File(pcmFilePath)
        // use {} closes the stream even if reading or writing fails
        FileInputStream(pcmFile).use { ins ->
            val bytes = ByteArray(bufferSize)
            var len: Int
            while (ins.read(bytes).also { len = it } > 0) {
                audioTrack.write(bytes, 0, len)
            }
        }
        audioTrack.stop()
    }
}
// Start playback
private fun start(){
audioTrack.play()
writeAudioData()
}
The basic usage of AudioTrack is as described above. If you are interested in the code related to playing audio with AudioTrack, please leave a comment to obtain it.