#StackBounty: #xamarin #audio #xamarin.android #microsoft-cognitive Connect to Microsoft's Cognitive Speaker Recognition API via Xa…

Bounty: 50

I am building a test application that authenticates users via Microsoft's Cognitive Speaker Recognition API. It seems straightforward, but as the API docs mention, creating an enrollment requires sending the byte[] of the audio file I record. Since I am using Xamarin.Android, I was able to record the audio and save it. However, the API's requirements for that audio are quite specific.

According to the API docs, the audio file format must meet the following requirements.

Container -> WAV
Encoding -> PCM
Rate -> 16K
Sample Format -> 16 bit
Channels -> Mono
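
For reference, here is a minimal sketch of what those requirements look like as a WAV file header. It assumes the raw 16-bit PCM samples are already available as a byte[]; the class and method names (WavSketch, WrapPcmAsWav) are hypothetical and only for illustration. It uses plain System.IO, with no Android-specific calls, and writes the canonical 44-byte RIFF/WAVE header in front of the samples.

```csharp
using System;
using System.IO;
using System.Text;

static class WavSketch
{
    // Hypothetical helper: wraps raw PCM samples in a canonical 44-byte RIFF/WAVE header.
    // Defaults mirror the listed requirements: 16 kHz, mono, 16-bit PCM.
    public static byte[] WrapPcmAsWav(byte[] pcm, int sampleRate = 16000,
                                      short channels = 1, short bitsPerSample = 16)
    {
        int byteRate = sampleRate * channels * bitsPerSample / 8;   // bytes per second
        short blockAlign = (short)(channels * bitsPerSample / 8);   // bytes per sample frame

        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms, Encoding.ASCII))
        {
            // BinaryWriter writes integers little-endian, as the WAV format expects.
            w.Write(Encoding.ASCII.GetBytes("RIFF"));
            w.Write(36 + pcm.Length);                 // size of everything after this field
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt "));
            w.Write(16);                              // fmt chunk size for plain PCM
            w.Write((short)1);                        // audio format 1 = PCM
            w.Write(channels);
            w.Write(sampleRate);
            w.Write(byteRate);
            w.Write(blockAlign);
            w.Write(bitsPerSample);
            w.Write(Encoding.ASCII.GetBytes("data"));
            w.Write(pcm.Length);                      // size of the sample data in bytes
            w.Write(pcm);
            w.Flush();
            return ms.ToArray();
        }
    }
}
```

With these defaults, one second of audio is 32,000 bytes of samples, so the resulting array is 32,044 bytes long.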

Following this recipe, I successfully recorded the audio, and after some experimenting and reading the Android docs, I was able to apply these settings as well:

_recorder.SetOutputFormat(OutputFormat.ThreeGpp);

_recorder.SetAudioChannels(1);
_recorder.SetAudioSamplingRate(16);
_recorder.SetAudioEncodingBitRate(16000);

_recorder.SetAudioEncoder((AudioEncoder) Encoding.Pcm16bit);

This meets most of the criteria for the required audio file. However, I cannot seem to save the file in actual “.wav” format, and I cannot verify whether the file is actually PCM encoded.

Here are my AXML and MainActivity.cs: GitHub Gist

I also followed this code and incorporated it into my own: GitHub Gist

The file’s specs look fine, but the duration is wrong: no matter how long I record, it reports just 250 ms, so the audio is too short.
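
One way to narrow this down is to compare the duration the header declares with the duration implied by the bytes actually in the file. The sketch below assumes a canonical 44-byte header with no extra chunks and a little-endian platform (BitConverter follows the machine's byte order); the names WavDurationSketch, SecondsDeclaredInHeader and SecondsActuallyPresent are hypothetical. If the two values disagree, the size field in the header may not match what was written, though that is only one possible explanation and not something I have confirmed.

```csharp
using System;

static class WavDurationSketch
{
    const int HeaderSize = 44;      // assumption: canonical header, no extra chunks
    const int ByteRateOffset = 28;  // bytes per second, 4-byte little-endian integer
    const int DataSizeOffset = 40;  // declared sample data size, 4-byte little-endian integer

    // Duration according to the size field stored in the header.
    public static double SecondsDeclaredInHeader(byte[] wav)
    {
        int byteRate = BitConverter.ToInt32(wav, ByteRateOffset);
        int dataSize = BitConverter.ToInt32(wav, DataSizeOffset);
        return (double)dataSize / byteRate;
    }

    // Duration according to how many bytes follow the header in the file.
    public static double SecondsActuallyPresent(byte[] wav)
    {
        int byteRate = BitConverter.ToInt32(wav, ByteRateOffset);
        return (double)(wav.Length - HeaderSize) / byteRate;
    }
}
```

At 16 kHz, mono, 16 bit, the byte rate is 32,000 bytes per second, so a declared data size of 8,000 bytes corresponds to 250 ms regardless of how many bytes actually follow the header.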

Is there any way to do this? Basically, I just want to be able to connect to Microsoft's Cognitive Speaker Recognition API via Xamarin.Android. I couldn’t find any resource that covers this.

