jReporter: A Smart Voice-Recording Mobile Application

User-Generated Content (UGC) and citizen journalism, together with the steady rise of web publishing, have changed the entire landscape of content creation and distribution. Crowdsourcing is now part of the standard practice of the largest news organizations. However, advances in content creation do not concern solely material created and provided by non-professional users. Mobile Journalism (MoJo) is an emerging field, finding applications mostly in live reporting and breaking news. Journalists exploit the sensing and connectivity capabilities of smart mobile devices much as citizens do in citizen journalism: modern devices combine cameras and microphones with global positioning sensors for geographical localization, and accelerometers and gyroscopes that monitor the movement and orientation of the device. The main goal of this work is the investigation and design of automated techniques that can assist in detecting and correcting common audio recording problems. These techniques are embedded in mobile software applications and are intended for semi-professional use, enabling audio recording of adequate quality without requiring special equipment or trained operators.

Overview

The evaluation of existing sound level measuring mobile applications suggests that a sophisticated audio analysis framework for audio-recording purposes could be valuable to journalists. In many audio recording scenarios, repeating the session is not an option, and under unfavourable conditions the quality of the captured material may be degraded. Many problems can be fixed in post-production, but others render the source material unusable. This work introduces a framework for monitoring voice-recording sessions that detects common mistakes and provides the user with feedback to avoid unwanted conditions, thereby improving recording quality. The framework specifies techniques for measuring sound level, estimating reverberation time and performing semantic analysis of the audio content through audio processing and feature-based classification.
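The sound level measurement mentioned above can be illustrated with a minimal sketch. The snippet below computes the RMS level of a digital signal in dBFS (decibels relative to full scale); the function name `level_dbfs` and all parameters are illustrative assumptions, not part of the described framework, and real SPL metering would additionally require device calibration and frequency weighting (e.g. A-weighting), which are omitted here.

```python
import numpy as np

def level_dbfs(samples: np.ndarray) -> float:
    """Return the RMS level of a float signal (range -1..1) in dBFS.

    A minimal sketch of sound level measurement; calibration and
    frequency weighting are intentionally left out.
    """
    rms = np.sqrt(np.mean(np.square(samples)))
    if rms == 0:
        return -np.inf
    return 20.0 * np.log10(rms)

# A full-scale sine wave has RMS 1/sqrt(2), i.e. about -3.01 dBFS.
t = np.linspace(0, 1, 48000, endpoint=False)
sine = np.sin(2 * np.pi * 440 * t)
```

In a monitoring application such a measurement would run on short frames (e.g. 20–50 ms) so the level display can track the speaker in near real time.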

Proposed Framework

The proposed framework is built on three processing pipelines: continuous inspection of the raw input signal, accurate sound level measurement and semantic analysis of the audio content. A decision-making algorithm then detects inappropriate recording conditions by combining the individual outputs of these modules. An investigation into a concrete classification of unwanted recording conditions yielded five categories: clipping of the signal, high background noise, low signal-to-noise ratio, presence of high-level noise and inappropriate room acoustics with high reverberation time.
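Two of these condition checks can be sketched in a few lines. The snippet below shows one plausible way to flag clipping (too many samples at full scale) and a low signal-to-noise ratio (comparing the mean power of speech frames against noise-only frames); the thresholds, function names and frame-labelling scheme are assumptions for illustration, not the framework's actual decision rules.

```python
import numpy as np

CLIP_THRESHOLD = 0.99   # assumed: |sample| above this fraction of full scale counts as clipped
MIN_SNR_DB = 15.0       # assumed: minimum acceptable signal-to-noise ratio

def detect_clipping(frame: np.ndarray, max_clipped_ratio: float = 0.001) -> bool:
    """Flag a frame as clipped if too many samples sit at full scale."""
    clipped_fraction = np.mean(np.abs(frame) >= CLIP_THRESHOLD)
    return clipped_fraction > max_clipped_ratio

def estimate_snr_db(speech_frames: list, noise_frames: list) -> float:
    """Rough SNR from mean power of speech frames vs. noise-only frames."""
    p_speech = np.mean([np.mean(f ** 2) for f in speech_frames])
    p_noise = np.mean([np.mean(f ** 2) for f in noise_frames])
    return 10.0 * np.log10(p_speech / p_noise)

def flag_conditions(frame: np.ndarray, speech_frames: list, noise_frames: list) -> list:
    """Combine the individual checks into a list of condition warnings."""
    issues = []
    if detect_clipping(frame):
        issues.append("clipping")
    if estimate_snr_db(speech_frames, noise_frames) < MIN_SNR_DB:
        issues.append("low SNR")
    return issues
```

The remaining categories (high background noise level, high-level noise, excessive reverberation time) would plug into the same decision step in a similar fashion, each contributing one flag that the decision-making algorithm aggregates into user feedback.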
