AAC Encoder Interface Reference
Overview
The AAC (Advanced Audio Coding) encoder DirectShow filters expose interfaces for high-quality audio encoding to the AAC format. AAC was designed as the successor to MP3: it generally delivers better sound quality at the same bitrate, and it is the standard audio codec for MP4, M4A, and streaming applications.
Two AAC encoder interfaces are available:
- IMonogramAACEncoder: simple configuration interface using a single configuration structure
- IVFAACEncoder: comprehensive interface with individual property methods for fine-grained control
IMonogramAACEncoder Interface
Overview
The IMonogramAACEncoder interface provides a simple, structure-based configuration approach for AAC encoding. Configuration is performed using the AACConfig structure that contains all essential encoding parameters.
Interface GUID: {B2DE30C0-1441-4451-A0CE-A914FD561D7F}
Inherits From: IUnknown
AACConfig Structure
/// <summary>
/// AAC encoder configuration structure.
/// </summary>
public struct AACConfig
{
    /// <summary>
    /// AAC version/profile (typically 2 for AAC-LC, 4 for AAC-HE)
    /// </summary>
    public int version;
    /// <summary>
    /// Object type / profile:
    /// 2 = AAC-LC (Low Complexity) - recommended for most uses
    /// 5 = AAC-HE (High Efficiency)
    /// 29 = AAC-HEv2 (High Efficiency version 2)
    /// </summary>
    public int object_type;
    /// <summary>
    /// Output format type (0 = Raw AAC, 1 = ADTS)
    /// </summary>
    public int output_type;
    /// <summary>
    /// Target bitrate in bits per second (e.g., 128000 for 128 kbps)
    /// </summary>
    public int bitrate;
}
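As an illustration of how the four fields fit together, the following C++ sketch fills an AACConfig for AAC-LC with ADTS output. `MakeLcAdtsConfig` is a hypothetical helper, not part of the filter API, and the `version` value simply follows the field comment above; treat both as assumptions.

```cpp
// Mirrors the AACConfig layout documented above.
struct AACConfig
{
    int version;
    int object_type;
    int output_type;
    int bitrate;
};

// Hypothetical helper (not part of the filter API): builds a configuration
// for AAC-LC with ADTS framing at the requested bitrate.
inline AACConfig MakeLcAdtsConfig(int bitrateBps)
{
    AACConfig cfg = {};
    cfg.version = 2;          // value the field comment lists for AAC-LC (assumption)
    cfg.object_type = 2;      // 2 = AAC-LC
    cfg.output_type = 1;      // 1 = ADTS
    cfg.bitrate = bitrateBps; // bits per second, e.g. 128000
    return cfg;
}
```

Calling `MakeLcAdtsConfig(128000)` yields a structure that could then be handed to `SetConfig`.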
AACInfo Structure
/// <summary>
/// AAC encoder runtime information.
/// </summary>
public struct AACInfo
{
    /// <summary>
    /// Input sample rate in Hz (e.g., 44100, 48000)
    /// </summary>
    public int samplerate;
    /// <summary>
    /// Number of audio channels (1 = mono, 2 = stereo, 6 = 5.1, etc.)
    /// </summary>
    public int channels;
    /// <summary>
    /// AAC frame size in samples (typically 1024 for AAC-LC)
    /// </summary>
    public int frame_size;
    /// <summary>
    /// Total number of frames encoded
    /// </summary>
    public long frames_done;
}
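The fields of AACInfo are enough to estimate how much audio has been encoded: frames_done × frame_size gives the number of samples per channel, and dividing by samplerate gives seconds. The sketch below covers only that arithmetic. `EncodedSeconds` is a hypothetical helper, and since the interface definitions in this reference do not show how an AACInfo is retrieved, that step is left out.

```cpp
#include <cstdint>

// Mirrors the AACInfo layout documented above (__int64 on Windows).
struct AACInfo
{
    int samplerate;
    int channels;
    int frame_size;
    std::int64_t frames_done;
};

// Hypothetical helper: duration of the audio encoded so far, in seconds.
// Returns 0 when the sample rate is not yet known.
inline double EncodedSeconds(const AACInfo& info)
{
    if (info.samplerate <= 0)
        return 0.0;
    return static_cast<double>(info.frames_done) * info.frame_size / info.samplerate;
}
```

For example, 46875 frames of 1024 samples at 48000 Hz correspond to 1000 seconds of audio.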
Interface Definitions
C# Definition
using System;
using System.Runtime.InteropServices;
namespace VisioForge.DirectShowAPI
{
    /// <summary>
    /// AAC encoder configuration structure.
    /// </summary>
    [StructLayout(LayoutKind.Sequential)]
    public struct AACConfig
    {
        public int version;
        public int object_type;
        public int output_type;
        public int bitrate;
    }
    /// <summary>
    /// AAC encoder runtime information.
    /// </summary>
    [StructLayout(LayoutKind.Sequential)]
    public struct AACInfo
    {
        public int samplerate;
        public int channels;
        public int frame_size;
        public long frames_done;
    }
    /// <summary>
    /// Monogram AAC encoder configuration interface.
    /// Provides structure-based configuration for AAC encoding.
    /// </summary>
    [ComImport]
    [Guid("B2DE30C0-1441-4451-A0CE-A914FD561D7F")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IMonogramAACEncoder
    {
        /// <summary>
        /// Gets the current AAC encoder configuration.
        /// </summary>
        /// <param name="config">Reference to AACConfig structure to receive current settings</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetConfig(ref AACConfig config);
        /// <summary>
        /// Sets the AAC encoder configuration.
        /// </summary>
        /// <param name="config">Reference to AACConfig structure containing desired settings</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetConfig(ref AACConfig config);
    }
}
C++ Definition
#include <unknwn.h>
// Note: include <initguid.h> before this declaration in exactly one source
// file so that DEFINE_GUID emits the GUID definition rather than only a declaration.
// {B2DE30C0-1441-4451-A0CE-A914FD561D7F}
DEFINE_GUID(IID_IMonogramAACEncoder,
    0xb2de30c0, 0x1441, 0x4451, 0xa0, 0xce, 0xa9, 0x14, 0xfd, 0x56, 0x1d, 0x7f);
/// <summary>
/// AAC encoder configuration structure.
/// </summary>
struct AACConfig
{
    int version;
    int object_type;
    int output_type;
    int bitrate;
};
/// <summary>
/// AAC encoder runtime information.
/// </summary>
struct AACInfo
{
    int samplerate;
    int channels;
    int frame_size;
    __int64 frames_done;
};
/// <summary>
/// Monogram AAC encoder configuration interface.
/// </summary>
DECLARE_INTERFACE_(IMonogramAACEncoder, IUnknown)
{
    /// <summary>
    /// Gets the current AAC encoder configuration.
    /// </summary>
    /// <param name="config">Pointer to AACConfig structure to receive settings</param>
    /// <returns>S_OK for success</returns>
    STDMETHOD(GetConfig)(THIS_
        AACConfig* config
    ) PURE;
    /// <summary>
    /// Sets the AAC encoder configuration.
    /// </summary>
    /// <param name="config">Pointer to AACConfig structure with desired settings</param>
    /// <returns>S_OK for success</returns>
    STDMETHOD(SetConfig)(THIS_
        const AACConfig* config
    ) PURE;
};
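Because the interface works on a whole structure, changing a single setting is usually a read-modify-write: call GetConfig, change one field, call SetConfig. The sketch below shows that pattern. It is templated so that it compiles without the Windows headers; `UpdateBitrate` and `FakeEncoder` are hypothetical names, and `long` stands in for HRESULT (negative values indicate failure). With the real filter, the pointer would presumably be obtained by calling QueryInterface on the encoder filter with IID_IMonogramAACEncoder.

```cpp
// Mirrors the AACConfig layout from the definition above.
struct AACConfig
{
    int version;
    int object_type;
    int output_type;
    int bitrate;
};

// Hypothetical helper: changes only the bitrate and keeps every other field.
// Encoder is any type exposing GetConfig(AACConfig*) and
// SetConfig(const AACConfig*) that return an HRESULT-like long.
template <typename Encoder>
long UpdateBitrate(Encoder* encoder, int bitrateBps)
{
    AACConfig cfg = {};
    long hr = encoder->GetConfig(&cfg);   // read the current settings
    if (hr < 0)
        return hr;                        // propagate the failure code
    cfg.bitrate = bitrateBps;             // modify a single field
    return encoder->SetConfig(&cfg);      // write the structure back
}

// Stand-in used only to exercise the helper without a DirectShow graph.
struct FakeEncoder
{
    AACConfig stored = {2, 2, 1, 96000};
    long GetConfig(AACConfig* config) { *config = stored; return 0; }
    long SetConfig(const AACConfig* config) { stored = *config; return 0; }
};
```

Reading the configuration first avoids accidentally resetting the profile or output type when only the bitrate should change.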
Delphi Definition
uses
  ActiveX, ComObj;
const
  IID_IMonogramAACEncoder: TGUID = '{B2DE30C0-1441-4451-A0CE-A914FD561D7F}';
type
  /// <summary>
  /// AAC encoder configuration structure.
  /// </summary>
  TAACConfig = record
    version: Integer;
    object_type: Integer;
    output_type: Integer;
    bitrate: Integer;
  end;
  /// <summary>
  /// AAC encoder runtime information.
  /// </summary>
  TAACInfo = record
    samplerate: Integer;
    channels: Integer;
    frame_size: Integer;
    frames_done: Int64;
  end;
  /// <summary>
  /// Monogram AAC encoder configuration interface.
  /// </summary>
  IMonogramAACEncoder = interface(IUnknown)
    ['{B2DE30C0-1441-4451-A0CE-A914FD561D7F}']
    /// <summary>
    /// Gets the current AAC encoder configuration.
    /// </summary>
    function GetConfig(var config: TAACConfig): HRESULT; stdcall;
    /// <summary>
    /// Sets the AAC encoder configuration.
    /// </summary>
    function SetConfig(const config: TAACConfig): HRESULT; stdcall;
  end;
IVFAACEncoder Interface
Overview
The IVFAACEncoder interface provides comprehensive, property-based configuration for AAC encoding with individual getter/setter methods for each parameter. This interface offers finer control and is easier to use for incremental configuration changes.
Interface GUID: {0BEF7533-39E6-42a5-863F-E087FAB5D84F}
Inherits From: IUnknown
Interface Definitions
C# Definition
using System;
using System.Runtime.InteropServices;
namespace VisioForge.DirectShowAPI
{
    /// <summary>
    /// VisioForge AAC encoder configuration interface.
    /// Provides comprehensive property-based control over AAC encoding parameters.
    /// </summary>
    [ComImport]
    [Guid("0BEF7533-39E6-42a5-863F-E087FAB5D84F")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IVFAACEncoder
    {
        /// <summary>
        /// Forces a specific input sample rate. Set to 0 to accept any rate.
        /// </summary>
        /// <param name="ulSampleRate">Sample rate in Hz (e.g., 44100, 48000). 0 = any rate</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetInputSampleRate(uint ulSampleRate);
        /// <summary>
        /// Gets the configured input sample rate.
        /// </summary>
        /// <param name="pulSampleRate">Receives sample rate in Hz. 0 if not fixed</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetInputSampleRate(out uint pulSampleRate);
        /// <summary>
        /// Sets the number of input channels.
        /// </summary>
        /// <param name="nChannels">Number of channels (1=mono, 2=stereo, 6=5.1, etc.)</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetInputChannels(short nChannels);
        /// <summary>
        /// Gets the number of input channels.
        /// </summary>
        /// <param name="pnChannels">Receives the number of channels</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetInputChannels(out short pnChannels);
        /// <summary>
        /// Sets the target bitrate. Pass -1 to use the maximum bitrate; because the
        /// parameter is unsigned, -1 is passed as uint.MaxValue (0xFFFFFFFF).
        /// </summary>
        /// <param name="ulBitRate">Bitrate in bits per second (e.g., 128000). uint.MaxValue (-1 as unsigned) = maximum</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetBitRate(uint ulBitRate);
        /// <summary>
        /// Gets the configured bitrate.
        /// </summary>
        /// <param name="pulBitRate">Receives bitrate in bps. uint.MaxValue (-1 as unsigned) if set to maximum</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetBitRate(out uint pulBitRate);
        /// <summary>
        /// Sets the AAC profile type.
        /// </summary>
        /// <param name="uProfile">Profile: 2=AAC-LC, 5=AAC-HE, 29=AAC-HEv2</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetProfile(uint uProfile);
        /// <summary>
        /// Gets the current AAC profile.
        /// </summary>
        /// <param name="puProfile">Receives the profile type</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetProfile(out uint puProfile);
        /// <summary>
        /// Sets the output format.
        /// </summary>
        /// <param name="uFormat">Format: 0=Raw AAC, 1=ADTS</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetOutputFormat(uint uFormat);
        /// <summary>
        /// Gets the output format.
        /// </summary>
        /// <param name="puFormat">Receives the output format</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetOutputFormat(out uint puFormat);
        /// <summary>
        /// Sets the time shift value for timestamp adjustment.
        /// </summary>
        /// <param name="timeShift">Time shift in milliseconds</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetTimeShift(int timeShift);
        /// <summary>
        /// Gets the time shift value.
        /// </summary>
        /// <param name="ptimeShift">Receives the time shift in milliseconds</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetTimeShift(out int ptimeShift);
        /// <summary>
        /// Enables or disables Low Frequency Effects (LFE) channel.
        /// </summary>
        /// <param name="lfe">1 to enable LFE, 0 to disable</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetLFE(uint lfe);
        /// <summary>
        /// Gets the LFE channel state.
        /// </summary>
        /// <param name="p">Receives LFE state (1=enabled, 0=disabled)</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetLFE(out uint p);
        /// <summary>
        /// Enables or disables Temporal Noise Shaping (TNS).
        /// TNS improves encoding of transient sounds.
        /// </summary>
        /// <param name="tns">1 to enable TNS, 0 to disable</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetTNS(uint tns);
        /// <summary>
        /// Gets the TNS state.
        /// </summary>
        /// <param name="p">Receives TNS state (1=enabled, 0=disabled)</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetTNS(out uint p);
        /// <summary>
        /// Enables or disables Mid-Side stereo coding.
        /// Can improve compression for stereo audio.
        /// </summary>
        /// <param name="v">1 to enable mid-side coding, 0 to disable</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int SetMidSide(uint v);
        /// <summary>
        /// Gets the mid-side coding state.
        /// </summary>
        /// <param name="p">Receives mid-side state (1=enabled, 0=disabled)</param>
        /// <returns>HRESULT (0 for success)</returns>
        [PreserveSig]
        int GetMidSide(out uint p);
    }
}
C++ Definition
#include <unknwn.h>
// {0BEF7533-39E6-42a5-863F-E087FAB5D84F}
DEFINE_GUID(IID_IVFAACEncoder,
    0x0bef7533, 0x39e6, 0x42a5, 0x86, 0x3f, 0xe0, 0x87, 0xfa, 0xb5, 0xd8, 0x4f);
/// <summary>
/// VisioForge AAC encoder configuration interface.
/// </summary>
DECLARE_INTERFACE_(IVFAACEncoder, IUnknown)
{
    STDMETHOD(SetInputSampleRate)(THIS_ unsigned long ulSampleRate) PURE;
    STDMETHOD(GetInputSampleRate)(THIS_ unsigned long* pulSampleRate) PURE;
    STDMETHOD(SetInputChannels)(THIS_ short nChannels) PURE;
    STDMETHOD(GetInputChannels)(THIS_ short* pnChannels) PURE;
    STDMETHOD(SetBitRate)(THIS_ unsigned long ulBitRate) PURE;
    STDMETHOD(GetBitRate)(THIS_ unsigned long* pulBitRate) PURE;
    STDMETHOD(SetProfile)(THIS_ unsigned long uProfile) PURE;
    STDMETHOD(GetProfile)(THIS_ unsigned long* puProfile) PURE;
    STDMETHOD(SetOutputFormat)(THIS_ unsigned long uFormat) PURE;
    STDMETHOD(GetOutputFormat)(THIS_ unsigned long* puFormat) PURE;
    STDMETHOD(SetTimeShift)(THIS_ int timeShift) PURE;
    STDMETHOD(GetTimeShift)(THIS_ int* ptimeShift) PURE;
    STDMETHOD(SetLFE)(THIS_ unsigned long lfe) PURE;
    STDMETHOD(GetLFE)(THIS_ unsigned long* p) PURE;
    STDMETHOD(SetTNS)(THIS_ unsigned long tns) PURE;
    STDMETHOD(GetTNS)(THIS_ unsigned long* p) PURE;
    STDMETHOD(SetMidSide)(THIS_ unsigned long v) PURE;
    STDMETHOD(GetMidSide)(THIS_ unsigned long* p) PURE;
};
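With the property-based interface, each setting is applied by its own call, so a typical setup is a short sequence of setters with an HRESULT check after each one. The sketch below configures stereo AAC-LC with ADTS output. It is templated so that it compiles without the Windows headers; `ConfigureStereoLc`, `kMaxBitRate`, and `FakeVFEncoder` are hypothetical names, `long` stands in for HRESULT, and the order of the calls is an assumption rather than a documented requirement.

```cpp
// -1 converted to unsigned: the value the interface documents as "maximum bitrate".
const unsigned long kMaxBitRate = static_cast<unsigned long>(-1);

// Hypothetical helper: applies a stereo AAC-LC / ADTS setup and stops at the
// first failing call, returning its HRESULT-like code (negative = failure).
template <typename Encoder>
long ConfigureStereoLc(Encoder* enc, unsigned long bitRate)
{
    long hr;
    if ((hr = enc->SetInputSampleRate(0)) < 0) return hr;  // 0 = accept any rate
    if ((hr = enc->SetInputChannels(2)) < 0) return hr;    // 2 = stereo
    if ((hr = enc->SetBitRate(bitRate)) < 0) return hr;    // bits per second
    if ((hr = enc->SetProfile(2)) < 0) return hr;          // 2 = AAC-LC
    if ((hr = enc->SetOutputFormat(1)) < 0) return hr;     // 1 = ADTS
    return enc->SetMidSide(1);                             // enable mid-side coding
}

// Stand-in used only to exercise the helper without a DirectShow graph.
struct FakeVFEncoder
{
    unsigned long sampleRate = 1, bitRate = 0, profile = 0, format = 0, midSide = 0;
    short channels = 0;
    long SetInputSampleRate(unsigned long v) { sampleRate = v; return 0; }
    long SetInputChannels(short v) { channels = v; return 0; }
    long SetBitRate(unsigned long v) { bitRate = v; return 0; }
    long SetProfile(unsigned long v) { profile = v; return 0; }
    long SetOutputFormat(unsigned long v) { format = v; return 0; }
    long SetMidSide(unsigned long v) { midSide = v; return 0; }
};
```

Passing `kMaxBitRate` instead of a concrete value requests the maximum bitrate; on Windows, where `unsigned long` is 32 bits wide, that constant is 0xFFFFFFFF.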
Delphi Definition
uses
  ActiveX, ComObj;
const
  IID_IVFAACEncoder: TGUID = '{0BEF7533-39E6-42a5-863F-E087FAB5D84F}';
type
  /// <summary>
  /// VisioForge AAC encoder configuration interface.
  /// </summary>
  IVFAACEncoder = interface(IUnknown)
    ['{0BEF7533-39E6-42a5-863F-E087FAB5D84F}']
    function SetInputSampleRate(ulSampleRate: Cardinal): HRESULT; stdcall;
    function GetInputSampleRate(out pulSampleRate: Cardinal): HRESULT; stdcall;
    function SetInputChannels(nChannels: SmallInt): HRESULT; stdcall;
    function GetInputChannels(out pnChannels: SmallInt): HRESULT; stdcall;
    function SetBitRate(ulBitRate: Cardinal): HRESULT; stdcall;
    function GetBitRate(out pulBitRate: Cardinal): HRESULT; stdcall;
    function SetProfile(uProfile: Cardinal): HRESULT; stdcall;
    function GetProfile(out puProfile: Cardinal): HRESULT; stdcall;
    function SetOutputFormat(uFormat: Cardinal): HRESULT; stdcall;
    function GetOutputFormat(out puFormat: Cardinal): HRESULT; stdcall;
    function SetTimeShift(timeShift: Integer): HRESULT; stdcall;
    function GetTimeShift(out ptimeShift: Integer): HRESULT; stdcall;
    function SetLFE(lfe: Cardinal): HRESULT; stdcall;
    function GetLFE(out p: Cardinal): HRESULT; stdcall;
    function SetTNS(tns: Cardinal): HRESULT; stdcall;
    function GetTNS(out p: Cardinal): HRESULT; stdcall;
    function SetMidSide(v: Cardinal): HRESULT; stdcall;
    function GetMidSide(out p: Cardinal): HRESULT; stdcall;
  end;
AAC Profiles and Configuration
AAC Profiles
AAC-LC (Low Complexity) - Profile 2 (Recommended):
- Best quality-to-bitrate ratio
- Lowest computational complexity
- Universal decoder support
- Use for: Music, podcasts, video soundtracks
- Bitrate range: 64-320 kbps

AAC-HE (High Efficiency) - Profile 5:
- Optimized for low bitrates
- Uses Spectral Band Replication (SBR)
- Better quality than AAC-LC at low bitrates (<= 64 kbps)
- Use for: Streaming, voice, low-bitrate applications
- Bitrate range: 32-80 kbps

AAC-HEv2 (High Efficiency version 2) - Profile 29:
- Further optimized for very low bitrates
- Uses Parametric Stereo (PS) in addition to SBR
- Best for mono/stereo at extremely low bitrates
- Use for: Voice streaming, very low bandwidth
- Bitrate range: 16-40 kbps
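The bitrate ranges above overlap, so an application has to pick its own cut-over points. The sketch below shows one possible policy, not a rule imposed by the encoder: below 32 kbps it picks AAC-HEv2 for mono or stereo input, below 64 kbps AAC-HE, and AAC-LC otherwise. `PickProfile` and `InListedRange` are hypothetical helpers; the enum values are the profile numbers listed above.

```cpp
// Profile numbers as listed above.
enum AacProfile { kAacLc = 2, kAacHe = 5, kAacHeV2 = 29 };

// Hypothetical policy: choose a profile from the target bitrate and channel count.
// The thresholds are one reading of the overlapping ranges, not encoder rules.
inline AacProfile PickProfile(int bitrateBps, int channels)
{
    if (bitrateBps < 32000 && channels <= 2)
        return kAacHeV2;   // very low bitrate, mono/stereo only
    if (bitrateBps < 64000)
        return kAacHe;     // low bitrate
    return kAacLc;         // 64 kbps and above
}

// Checks a bitrate against the range listed above for the given profile.
inline bool InListedRange(AacProfile profile, int bitrateBps)
{
    switch (profile)
    {
        case kAacLc:   return bitrateBps >= 64000 && bitrateBps <= 320000;
        case kAacHe:   return bitrateBps >= 32000 && bitrateBps <= 80000;
        case kAacHeV2: return bitrateBps >= 16000 && bitrateBps <= 40000;
    }
    return false;
}
```

The returned value can be passed to `SetProfile` or stored in `AACConfig.object_type`. `InListedRange` is useful for warning when a user-selected combination, such as AAC-LC at 48 kbps, falls outside the ranges above.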
Output Formats
Raw AAC (Format 0):
- Pure AAC bitstream without container
- Requires external container (MP4, M4A, MKV)
- Use for: Muxing into MP4/M4A files
- Smallest output size

ADTS (Audio Data Transport Stream) - Format 1:
- AAC with frame headers
- Self-contained, can be played directly
- Slightly larger than raw AAC
- Use for: Standalone AAC files, streaming
- Better error resilience
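To put a number on "slightly larger": an ADTS header is 7 bytes per frame (9 bytes when the optional CRC is present), so the overhead depends only on how many frames are produced per second. The sketch below estimates it, assuming 1024 samples per frame as is typical for AAC-LC. `AdtsOverheadBps` is a hypothetical helper, not part of the filter API.

```cpp
// Hypothetical helper: extra bits per second added by ADTS headers.
// sampleRate is in Hz; frameSize is samples per AAC frame (1024 for AAC-LC).
inline double AdtsOverheadBps(int sampleRate, int frameSize = 1024, bool withCrc = false)
{
    const int headerBytes = withCrc ? 9 : 7;  // ADTS header size in bytes
    const double framesPerSecond = static_cast<double>(sampleRate) / frameSize;
    return framesPerSecond * headerBytes * 8; // bytes to bits
}
```

At 48000 Hz this gives 2625 bits per second, roughly 2% of a 128 kbps stream.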