project:build.sh: Added fastboot support; custom modifications to U-Boot and kernel implemented using patches.

project:cfg:BoardConfig_IPC: Added fastboot BoardConfig file and firmware post-scripts, distinguishing between the BoardConfigs for Luckfox Pico Pro and Luckfox Pico Max.
project:app: Added fastboot_client and rk_smart_door for quick boot applications; updated rkipc app to adapt to the latest media library.
media:samples: Added more usage examples.
media:rockit: Fixed bugs; removed support for retrieving data frames from VPSS.
media:isp: Updated rkaiq library and related tools to support connection to RKISP_Tuner.
sysdrv:Makefile: Added support for compiling drv_ko on Luckfox Pico Ultra W using Ubuntu; added support for custom root filesystem.
sysdrv:tools:board: Updated Buildroot optional mirror sources, updated some software versions, and stored device tree files and configuration files that undergo multiple modifications for U-Boot and kernel separately.
sysdrv:source:mcu: Used RISC-V MCU SDK with RT-Thread system, mainly for initializing camera AE during quick boot.
sysdrv:source:uboot: Added support for fastboot; added high baud rate DDR bin for serial firmware upgrades.
sysdrv:source:kernel: Upgraded to version 5.10.160; increased NPU frequency for RV1106G3; added support for fastboot.

Signed-off-by: luckfox-eng29 <eng29@luckfox.com>
luckfox-eng29
2024-08-21 10:05:47 +08:00
parent e79fd21975
commit 8f34c2760d
20902 changed files with 6567362 additions and 11248383 deletions

2
.gitignore vendored
@@ -6,3 +6,5 @@ IMAGE/
output/ output/
project/app/wifi_app/wpa_supplicant.conf project/app/wifi_app/wpa_supplicant.conf
config/ config/
sysdrv/tools/board/memtester/memtester-4.6.0/
sysdrv/tools/board/dosfstools/dosfstools-4.2/

@@ -5,17 +5,13 @@
* It provides a customized SDK specifically for Luckfox Pico series development boards  * It provides a customized SDK specifically for Luckfox Pico series development boards 
* Aimed at providing developers with a better programming experience * Aimed at providing developers with a better programming experience
## SDK Updatelog ## SDK Updatelog
* Current version V1.3 + Current Version: V1.4
1. Added support for Luckfox-pico-Ultra and Luckfox-pico-Ultra-W 1. Updated U-Boot to support fast boot for RV1106 using SPI NAND and eMMC.
2. Optimized the selection process for board support files 2. Optimized U-Boot compatibility with SD cards, reducing the likelihood of SD card recognition failures.
3. Improved the download speed of buildroot by selecting the fastest mirror based on the download environment 3. Updated the kernel version to 5.10.160, increasing the NPU frequency for RV1106G3.
4. Enhanced buildroot package management operations; added the `buildrootconfig` option to the `build.sh` command to directly enter buildroot's menuconfig 4. Updated the Buildroot mirror source for more stable package downloads.
5. Improved the rootfs clean operation to retain Buildroot already downloaded packages 5. Added support for custom file systems.
6. Enhanced kernel configuration operations; added the `kernelconfig` option to the `build.sh` command to enter the kernel's menuconfig 6. Partial bug fixes
7. Added a `config` folder for quick configuration of device trees, kernel, and buildroot
8. Optimized the system's root filesystem packaging process, allowing customization of root files in the `<Luckfox-pico SDK PATH>/output/out/rootfs_uclibc_rv1106` folder
9. Modified the default device tree configuration, enabling pin and interface function configuration on the board system using the `luckfox-config` command
10. Partial bug fixes
## SDK Usage Instructions ## SDK Usage Instructions
* recommended operating system : Ubuntu 22.04 * recommended operating system : Ubuntu 22.04
### Installing Dependencies ### Installing Dependencies

@@ -5,19 +5,15 @@
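* Provides a customized SDK specifically for Luckfox Pico series development boards * Provides a customized SDK specifically for Luckfox Pico series development boards
* Aimed at providing developers with a better programming experience * Aimed at providing developers with a better programming experience
## SDK Updatelog ## SDK Updatelog
* Current version V1.3 + Current version V1.4
1. Added support for Luckfox-pico-Ultra and Luckfox-pico-Ultra-W 1. Updated U-Boot to support fast boot for RV1106 using SPI NAND and eMMC
2. Optimized the selection process for board support files 2. Optimized U-Boot compatibility with SD cards, reducing the likelihood of SD card recognition failures
3. Improved the download speed of Buildroot packages by selecting a faster mirror server based on the download environment 3. Updated the kernel version to 5.10.160, increasing the NPU frequency for RV1106G3
4. Enhanced Buildroot package management; added the `buildrootconfig` option to the `build.sh` command to directly enter Buildroot's menuconfig 4. Updated the Buildroot mirror source for more stable package downloads
5. Improved the rootfs clean operation so that packages already downloaded by Buildroot are retained 5. Added support for custom file systems
6. Enhanced kernel configuration; added the `kernelconfig` option to the `build.sh` command to enter the kernel's menuconfig 6. Partial bug fixes
7. Added a `config` folder for quick configuration of device trees, kernel, and Buildroot
8. Optimized the root filesystem packaging process, allowing customization of root files in the `<Luckfox-pico SDK PATH>/output/out/rootfs_uclibc_rv1106` folder
9. Modified the default device tree configuration, enabling pin and interface function configuration on the board system using the `luckfox-config` command
10. Partial bug fixes
## SDK Usage Instructions ## SDK Usage Instructions
* Recommended operating system: Ubuntu 22.04 * Recommended system environment for using the SDK: Ubuntu 22.04
### Installing Dependencies ### Installing Dependencies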
```shell ```shell
sudo apt-get install repo git ssh make gcc gcc-multilib g++-multilib module-assistant expect g++ gawk texinfo libssl-dev bison flex fakeroot cmake unzip gperf autoconf device-tree-compiler libncurses5-dev pkg-config sudo apt-get install repo git ssh make gcc gcc-multilib g++-multilib module-assistant expect g++ gawk texinfo libssl-dev bison flex fakeroot cmake unzip gperf autoconf device-tree-compiler libncurses5-dev pkg-config

@@ -1,4 +1,11 @@
# Updatelog # Updatelog
## V1.4 Updatelog
1. Updated U-Boot to support fast boot for RV1106 using SPI NAND and eMMC.
2. Optimized U-Boot compatibility with SD cards, reducing the likelihood of SD card recognition failures.
3. Updated the kernel version to 5.10.160, increasing the NPU frequency for RV1106G3.
4. Updated the Buildroot mirror source for more stable package downloads.
5. Added support for custom file systems.
6. Partial bug fixes
## V1.3 Updatelog ## V1.3 Updatelog
1. Added support for Luckfox-pico-Ultra and Luckfox-pico-Ultra-W 1. Added support for Luckfox-pico-Ultra and Luckfox-pico-Ultra-W
2. Optimized the selection process for board support files 2. Optimized the selection process for board support files

@@ -1,4 +1,11 @@
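# Updatelog # Updatelog
## V1.4 Updatelog
1. Updated U-Boot to support fast boot for RV1106 using SPI NAND and eMMC
2. Optimized U-Boot compatibility with SD cards, reducing the likelihood of SD card recognition failures
3. Updated the kernel version to 5.10.160, increasing the NPU frequency for RV1106G3
4. Updated the Buildroot mirror source for more stable package downloads
5. Added support for custom file systems
6. Partial bug fixes
## V1.3 Updatelog ## V1.3 Updatelog
1. Added support for Luckfox-pico-Ultra and Luckfox-pico-Ultra-W 1. Added support for Luckfox-pico-Ultra and Luckfox-pico-Ultra-W
2. Optimized the selection process for board support files 2. Optimized the selection process for board support files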

63
media/.gitignore vendored
@@ -1,33 +1,36 @@
*/build */build
*/out */out
alsa-lib/alsa-lib-1.1.5/ #alsa-lib/alsa-lib-1.1.5/
out/ #avs
#cfg
#common_algorithm/common_algorithm/
#isp/camera_engine_rkaiq
#iva
#ive
#libdrm/libdrm-2.4.89/
#libv4l/argp-standalone-1.3/
#libv4l/v4l-utils-1.16.5/
#lvgl
#mali/
#mpp/mpp/
#npu
#npu/rknpu/
#npu/rknpu2/
#npu/rockface/
#npu/rockx/
out out
# avs #rga/rga/
# cfg #rkadk/
# common_algorithm/common_algorithm/ #rkadk/rkadk
# isp/camera_engine_rkaiq #rkfsmk/
# iva #rkfsmk/rkfsmk
# ive #rkmedia
# libdrm/libdrm-2.4.89/ #rkmedia/rkmedia
# libv4l/argp-standalone-1.3/ #rkpostisp
# libv4l/v4l-utils-1.16.5/ #rockauto/
# lvgl #rockit/rockit/
# mali/ #samples
# mpp/mpp/ #security/bin/
# npu #security/librkcrypto
# npu/rknpu/ #security/rk_tee_user/
# npu/rknpu2/ #sysutils
# npu/rockface/
# npu/rockx/
# rga/rga/
# rkadk/
# rkadk/rkadk
# rkfsmk/
# rkfsmk/rkfsmk
# rkmedia
# rkmedia/rkmedia
# rockit/rockit/
# samples
# security/librkcrypto
# sysutils

@@ -19,18 +19,23 @@ include $(MAKEFILE_DIR)/Makefile.param
media_src := $(wildcard ./*/Makefile) media_src := $(wildcard ./*/Makefile)
media_src := $(dir $(media_src)) media_src := $(dir $(media_src))
media_src := $(filter-out ./samples/,$(media_src))
################################################################################ ################################################################################
## build target ## build target
################################################################################ ################################################################################
all: all: media_libs
make -C ./samples
$(call MAROC_COPY_PKG_TO_MEDIA_OUTPUT, $(RK_PROJECT_PATH_MEDIA), $(RK_MEDIA_OUTPUT))
media_libs:
@rm -rf $(RK_MEDIA_OUTPUT) @rm -rf $(RK_MEDIA_OUTPUT)
$(foreach target,$(media_src),make -C $(target)||exit -1;) $(foreach target,$(media_src),make -C $(target)||exit -1;)
$(call MAROC_COPY_PKG_TO_MEDIA_OUTPUT, $(RK_PROJECT_PATH_MEDIA), $(RK_MEDIA_OUTPUT))
distclean: clean distclean: clean
clean: clean:
$(foreach target,$(media_src),make distclean -C $(target)||exit -1;) $(foreach target,$(media_src),make distclean -C $(target)||exit -1;)
make -C ./samples distclean
@rm -rf $(RK_MEDIA_OUTPUT) @rm -rf $(RK_MEDIA_OUTPUT)
info: info:

@@ -36,7 +36,13 @@ C_CYAN = \e[36;1m
C_WHITE = \e[37;1m C_WHITE = \e[37;1m
C_NORMAL = \033[0m C_NORMAL = \033[0m
RK_MEDIA_OPTS := -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 RK_MEDIA_OPTS := -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -ffunction-sections -fdata-sections
ifeq ($(RK_BUILD_VERSION_TYPE),DEBUG)
RK_MEDIA_OPTS += -O0
else
RK_MEDIA_OPTS += -Os
endif
RK_MEDIA_CROSS_CFLAGS += $(RK_MEDIA_OPTS) RK_MEDIA_CROSS_CFLAGS += $(RK_MEDIA_OPTS)
RK_MEDIA_TOP_DIR := $(MAKE_PARAM_DIR) RK_MEDIA_TOP_DIR := $(MAKE_PARAM_DIR)
@@ -49,6 +55,12 @@ RK_MEDIA_ARCH_TYPE := arm
RK_MEDIA_CROSS_CFLAGS += -march=armv7-a -mfpu=neon -mfloat-abi=hard RK_MEDIA_CROSS_CFLAGS += -march=armv7-a -mfpu=neon -mfloat-abi=hard
endif endif
ifeq ($(RK_MEDIA_CROSS),arm-rockchip1050-linux-uclibcgnueabihf)
RK_MEDIA_LIB_TYPE := uclibc
RK_MEDIA_ARCH_TYPE := arm
RK_MEDIA_CROSS_CFLAGS += -march=armv7-a -mfpu=neon -mfloat-abi=hard
endif
ifeq ($(RK_MEDIA_CROSS),arm-rockchip830-linux-gnueabihf) ifeq ($(RK_MEDIA_CROSS),arm-rockchip830-linux-gnueabihf)
RK_MEDIA_LIB_TYPE := glibc RK_MEDIA_LIB_TYPE := glibc
RK_MEDIA_ARCH_TYPE := arm RK_MEDIA_ARCH_TYPE := arm

@@ -34,6 +34,7 @@ alsa_lib-build: pre-built
cp -af $(CURRENT_DIR)/$(PATCH_DIR)/* ./; \ cp -af $(CURRENT_DIR)/$(PATCH_DIR)/* ./; \
$(SHELL) ./alsa-lib.patch.sh; \ $(SHELL) ./alsa-lib.patch.sh; \
autoreconf -f -i; \ autoreconf -f -i; \
CFLAGS="$(RK_MEDIA_CROSS_CFLAGS)" \
./configure --target=$(RK_MEDIA_CROSS) \ ./configure --target=$(RK_MEDIA_CROSS) \
--host=$(RK_MEDIA_CROSS) \ --host=$(RK_MEDIA_CROSS) \
--prefix=$(CURRENT_DIR)/$(PKG_BIN)/ \ --prefix=$(CURRENT_DIR)/$(PKG_BIN)/ \

@@ -8,7 +8,8 @@ export LC_ALL=C
SHELL:=/bin/bash SHELL:=/bin/bash
CURRENT_DIR := $(shell pwd) CURRENT_DIR := $(shell pwd)
PKG_TARBALL := middle_lut PKG_TARBALL := avs_calib
PKG_TARBALL += avs_lut
PKG_LIB_INSTALL_PATH := lib PKG_LIB_INSTALL_PATH := lib
PKG_BIN ?= out PKG_BIN ?= out
@@ -22,14 +23,14 @@ endif
all: $(PKG_TARGET) all: $(PKG_TARGET)
@echo "build $(PKG_NAME) done"; @echo "build $(PKG_NAME) done";
$(call MAROC_COPY_PKG_TO_MEDIA_OUTPUT, $(RK_MEDIA_OUTPUT), $(PKG_BIN))
avs-build: avs-build:
@rm -rf $(PKG_BIN); @test -f $(CURRENT_DIR)/$(PKG_BIN)/lib/librkAVS_genLut.so || (\
@mkdir -p $(PKG_BIN); rm -rf $(PKG_BIN) && mkdir -p $(PKG_BIN); \
cp -rfa $(PKG_TARBALL) $(PKG_BIN)/; cp -rfa $(PKG_TARBALL) $(PKG_BIN)/; \
cp -rfa include/ $(PKG_BIN)/include/; cp -rfa lib/ $(PKG_BIN)/lib/; \
cp -rfa lib/ $(PKG_BIN)/lib/; );
$(call MAROC_COPY_PKG_TO_MEDIA_OUTPUT, $(RK_MEDIA_OUTPUT), $(PKG_BIN))
clean: distclean clean: distclean

@@ -0,0 +1,55 @@
<?xml version="1.0"?>
<opencv_storage>
<cameras_num>2</cameras_num>
<calib_width>1920</calib_width>
<calib_height>1080</calib_height>
<camera_type>1</camera_type>
<camera_fov>120</camera_fov>
<camera_matrix_0 type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>f</dt>
<data>
1.69224902e+03 6.69298545e-02 9.55471680e+02 0. 1.69050842e+03
5.67265076e+02 0. 0. 1.</data></camera_matrix_0>
<camera_distortion_0 type_id="opencv-matrix">
<rows>1</rows>
<cols>4</cols>
<dt>f</dt>
<data>
-4.25682127e-01 1.36122614e-01 5.35543186e-05 -2.48203811e-04</data></camera_distortion_0>
<xi_0>6.0676449537277222e-01</xi_0>
<camera_pose_0 type_id="opencv-matrix">
<rows>4</rows>
<cols>4</cols>
<dt>f</dt>
<data>
1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1.</data></camera_pose_0>
<internal_meanReprojectError>1.2691220641136169e-01</internal_meanReprojectError>
<external_meanReprojectError>0.</external_meanReprojectError>
<camera_matrix_1 type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>f</dt>
<data>
1.66993884e+03 9.04271364e-01 9.61675171e+02 0. 1.66812769e+03
5.42897095e+02 0. 0. 1.</data></camera_matrix_1>
<camera_distortion_1 type_id="opencv-matrix">
<rows>1</rows>
<cols>4</cols>
<dt>f</dt>
<data>
-4.22289699e-01 1.35094926e-01 -7.43167475e-04 4.60190611e-04</data></camera_distortion_1>
<xi_1>5.8322030305862427e-01</xi_1>
<camera_pose_1 type_id="opencv-matrix">
<rows>4</rows>
<cols>4</cols>
<dt>f</dt>
<data>
5.08073032e-01 3.64603125e-03 8.61306310e-01 5.23154373e+01
-5.16786147e-03 9.99985933e-01 -1.18462974e-03 4.08885807e-01
-8.61298501e-01 -3.84923327e-03 5.08084714e-01 -3.10678463e+01 0. 0.
0. 1.</data></camera_pose_1>
<internal_meanReprojectError>1.6073322296142578e-01</internal_meanReprojectError>
<external_meanReprojectError>1.6884341835975647e-01</external_meanReprojectError>
</opencv_storage>

@@ -0,0 +1,55 @@
<?xml version="1.0"?>
<opencv_storage>
<cameras_num>2</cameras_num>
<calib_width>1920</calib_width>
<calib_height>1080</calib_height>
<camera_type>1</camera_type>
<camera_fov>107</camera_fov>
<camera_matrix_0 type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>f</dt>
<data>
3.73959448e+03 4.22944754e-01 9.48736694e+02 0. 3.73644849e+03
5.54968628e+02 0. 0. 1.</data></camera_matrix_0>
<camera_distortion_0 type_id="opencv-matrix">
<rows>1</rows>
<cols>4</cols>
<dt>f</dt>
<data>
-9.31460440e-01 1.30924940e+00 -8.51006992e-03 1.65518129e-03</data></camera_distortion_0>
<xi_0>2.3498959541320801e+00</xi_0>
<camera_pose_0 type_id="opencv-matrix">
<rows>4</rows>
<cols>4</cols>
<dt>f</dt>
<data>
1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1.</data></camera_pose_0>
<internal_meanReprojectError_0>2.8692653775215149e-01</internal_meanReprojectError_0>
<external_meanReprojectError_0>0.</external_meanReprojectError_0>
<camera_matrix_1 type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>f</dt>
<data>
3.89461865e+03 4.05671477e-01 9.99229065e+02 0. 3.88211426e+03
5.34389099e+02 0. 0. 1.</data></camera_matrix_1>
<camera_distortion_1 type_id="opencv-matrix">
<rows>1</rows>
<cols>4</cols>
<dt>f</dt>
<data>
-1.06845140e+00 1.71290421e+00 -3.96473333e-03 -5.20940730e-03</data></camera_distortion_1>
<xi_1>2.4256694316864014e+00</xi_1>
<camera_pose_1 type_id="opencv-matrix">
<rows>4</rows>
<cols>4</cols>
<dt>f</dt>
<data>
4.83097047e-01 -3.82677303e-03 8.75558436e-01 4.25615463e+01
6.23317529e-03 9.99980152e-01 9.31369665e-04 3.84045154e-01
-8.75544608e-01 5.00756735e-03 4.83111292e-01 -4.29618340e+01 0. 0.
0. 1.</data></camera_pose_1>
<internal_meanReprojectError_1>2.4914073944091797e-01</internal_meanReprojectError_1>
<external_meanReprojectError_1>7.4757629632949829e-01</external_meanReprojectError_1>
</opencv_storage>

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

@@ -1,180 +0,0 @@
// rkAVS_stitchFor1106.h: include file for standard system include files,
// or project-specific include files.
#pragma once
#include <stdint.h>
#ifdef __cplusplus
extern "C" {
#endif
#if defined _WIN32 || defined __CYGWIN__
#ifdef BUILDING_DLL
#ifdef __GNUC__
#define DLL_PUBLIC __attribute__((dllexport))
#else
// Note: actually gcc seems to also supports this syntax.
#define DLL_PUBLIC __declspec(dllexport)
#endif
#else
#ifdef __GNUC__
#define DLL_PUBLIC __attribute__((dllimport))
#else
// Note: actually gcc seems to also supports this syntax.
#define DLL_PUBLIC
#endif
#define DLL_LOCAL
#endif
#else
#if __GNUC__ >= 4
#define DLL_PUBLIC __attribute__((visibility("default")))
#define DLL_LOCAL __attribute__((visibility("hidden")))
#else
#define DLL_PUBLIC
#define DLL_LOCAL
#endif
#endif
#define CAMERA_NUM 2
#define BAND_NUM 4
#define MESH_SCALE_BIT 4
/** @struct RK_AVS_IMAGE_SIZE
* @brief Struct of image size.
*/
typedef struct RK_AVS_IMAGE_SIZE {
int32_t s32ImageWidth; /**< Width */
int32_t s32ImageHeight; /**< Height */
} AVS_IMAGE_SIZE;
/** @struct RK_AVS_IMAGE_YUV_PLANAR
* @brief Struct of Yuv Planar format.
*/
typedef struct RK_AVS_IMAGE_YUV_PLANAR {
void *y; /**< Head pointer of the Y image */
void *u; /**< Head pointer of the U image */
void *v; /**< Head pointer of the V image */
} AVS_IMAGE_YUV_PLANAR;
/** @struct RK_AVS_IMAGE_YUV_SEMI_PLANAR
* @brief Struct of Yuv SemiPlanar format.
*/
typedef struct RK_AVS_IMAGE_YUV_SEMI_PLANAR {
void *y; /**< Head pointer of the Y image */
void *uv; /**< Head pointer of the UV image */
} AVS_IMAGE_YUV_SEMI_PLANAR;
/** @struct RK_AVS_IMAGE_DATA
* @brief Image Data.
*/
typedef struct RK_AVS_IMAGE_DATA {
AVS_IMAGE_SIZE imageSize; /**< Size of image */
union {
void *p; /**< Head pointer of the image */
AVS_IMAGE_YUV_PLANAR planar; /**< Planar struct */
AVS_IMAGE_YUV_SEMI_PLANAR semiPlanar; /**< Semiplanar struct */
} dat; /**< image data pointer */
} AVS_IMAGE_DATA;
/** @struct RK_AVS_IMAGE_YUV_PLANAR_PITCH
* @brief Pitch value of Yuv Planar format.
*/
typedef struct RK_AVS_IMAGE_YUV_PLANAR_PITCH {
int32_t y; /**< Y data. Number of bytes from head of line to the head of next line */
int32_t u; /**< U data. Number of bytes from head of line to the head of next line */
int32_t v; /**< V data. Number of bytes from head of line to the head of next line */
} AVS_IMAGE_YUV_PLANAR_PITCH;
/** @struct RK_AVS_IMAGE_YUV_SEMI_PLANAR_PITCH
* @brief Pitch value of Yuv SemiPlanar format.
*/
typedef struct RK_AVS_IMAGE_YUV_SEMI_PLANAR_PITCH {
int32_t y; /**< Y data. Number of bytes from head of line to the head of next line */
int32_t
uv; /**< UV data. Number of bytes from head of line to the head of next line */
} AVS_IMAGE_YUV_SEMI_PLANAR_PITCH;
/** @struct RK_AVS_IMAGE_DATA_EX
* @brief Image Data.
*/
typedef struct RK_AVS_IMAGE_DATA_EX {
AVS_IMAGE_SIZE imageSize; /**< Size of image */
int32_t fd; /**< file descriptor */
union {
void *p; /**< Head pointer of the image data */
AVS_IMAGE_YUV_PLANAR planar; /**< Planar struct */
AVS_IMAGE_YUV_SEMI_PLANAR semiPlanar; /**< Semiplanar struct */
} dat; /**< image data pointer */
union {
int32_t p; /**< Number of bytes from head of line to the head of next line */
AVS_IMAGE_YUV_PLANAR_PITCH planar; /**< Planar pitch struct */
AVS_IMAGE_YUV_SEMI_PLANAR_PITCH semi_planar; /**< Semiplanar pitch struct */
} pitch; /**< image pitch union */
} AVS_IMAGE_DATA_EX;
/** @struct RK_AVS_STICH_PARAMS
* @brief Struct of image stitch params.
*/
typedef struct RK_AVS_STICH_BLOCK_PARAMS {
int32_t s32SrcOverlapStartY;
int32_t s32SrcNonverlapStartY;
int32_t s32SrcOverlapHeight;
int32_t s32SrcNonverlapHeight;
int32_t s32DstOverlapStartY;
int32_t s32DstNonverlapStartY;
int32_t s32DstOverlapHeight;
int32_t s32DstNonverlapHeight;
} AVS_STICH_BLOCK_PARAMS;
typedef struct RK_AVS_STITCH_ROI_PARAMS {
int32_t s32SrcCopyStartX;
int32_t s32SrcCopyStartY;
int32_t s32SrcCopyHeight;
int32_t s32SrcCopyWidth;
int32_t s32DstCopyStartX;
int32_t s32DstCopyStartY;
int32_t s32DstCopyHeight;
int32_t s32DstCopyWidth;
int32_t s32SrcRemapStartX;
int32_t s32SrcRemapStartY;
int32_t s32SrcRemapHeight;
int32_t s32SrcRemapWidth;
} AVS_STITCH_ROI_PARAMS;
typedef struct rkAVS_STITCH_PARAMS_S {
int32_t s32BandNum;
AVS_IMAGE_SIZE stStitchImageSize;
AVS_IMAGE_SIZE stAlphaSize;
AVS_STITCH_ROI_PARAMS stStitchRoiParams[CAMERA_NUM];
AVS_STICH_BLOCK_PARAMS stStitchBlockParams[CAMERA_NUM];
} AVS_STITCH_PARAMS_S;
/** @struct RK_AVS_INPUT_PARAMS
* @brief Struct of input params.
*/
typedef struct RK_AVS_INPUT_PARAMS {
int32_t s32LogLevel;
uint8_t *pu8LeftMesh;
uint8_t *pu8RightMesh;
uint8_t *pu8AlphaYuv;
uint8_t *pu8TmpBuffer;
AVS_STITCH_PARAMS_S stStitchParams;
} AVS_INPUT_PARAMS;
typedef struct RK_AVS_ENGINE {
void *p;
int32_t s32AsynchSelect;
} AVS_ENGINE;
DLL_PUBLIC int32_t rkAVS_getVersion(char avsToolVersion[128]);
DLL_PUBLIC int32_t rkAVS_initParams(AVS_ENGINE *stEngine);
DLL_PUBLIC int32_t rkAVS_stitchImages(AVS_ENGINE *stEngine,
AVS_IMAGE_DATA_EX *stLeftImageEx,
AVS_IMAGE_DATA_EX *stRightImageEx,
AVS_IMAGE_DATA_EX *stStitchImageEx,
AVS_INPUT_PARAMS *stInputParams);
DLL_PUBLIC int32_t rkAVS_destroy(AVS_ENGINE *stEngine);
#ifdef __cplusplus
} /* extern "C" { */
#endif

BIN
media/avs/lib/librkAVS_genLut.so Executable file

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large

File diff suppressed because it is too large

Binary file not shown.

File diff suppressed because it is too large

@@ -72,7 +72,14 @@ export CONFIG_RK_CRYPTO=n
# Enable libv4l # Enable libv4l
# export CONFIG_LIBV4L=y # export CONFIG_LIBV4L=y
# liblvgl ##------------------------------------------------
export CONFIG_LVGL=n # Rockchip's avs
#
# Enable avs Build
export CONFIG_RK_AVS=y export CONFIG_RK_AVS=y
#------------------------------------------------
# Rockchip's auto
#
# Enable auto Build
export CONFIG_RK_ROCKAUTO=y

@@ -16,6 +16,8 @@ ifeq ($(PKG_BIN),)
$(error ### $(CURRENT_DIR): PKG_BIN is NULL, Please Check !!!) $(error ### $(CURRENT_DIR): PKG_BIN is NULL, Please Check !!!)
endif endif
PKG_CONF_OPTS += -DCMAKE_C_FLAGS="$(RK_MEDIA_OPTS)" -DCMAKE_CXX_FLAGS="$(RK_MEDIA_OPTS)"
ifeq ($(RK_MEDIA_CHIP), rk3588) ifeq ($(RK_MEDIA_CHIP), rk3588)
PKG_CONF_OPTS += -DUSE_64BIT=TRUE PKG_CONF_OPTS += -DUSE_64BIT=TRUE
PKG_CONF_OPTS += -DUSE_32BIT=FALSE PKG_CONF_OPTS += -DUSE_32BIT=FALSE
@@ -23,6 +25,9 @@ PKG_CONF_OPTS += -DAEC_ANR_AGC_ENABLE=TRUE
PKG_CONF_OPTS += -DANR_ENABLE=TRUE PKG_CONF_OPTS += -DANR_ENABLE=TRUE
PKG_CONF_OPTS += -DMOVE_DETECT_ENABLE=TRUE PKG_CONF_OPTS += -DMOVE_DETECT_ENABLE=TRUE
PKG_CONF_OPTS += -DOCCLUSION_DETECT_ENABLE=TRUE PKG_CONF_OPTS += -DOCCLUSION_DETECT_ENABLE=TRUE
PKG_CONF_OPTS += -DRKAPPLUS_ENABLE=TRUE
PKG_CONF_OPTS += -DRKAPPLUS_WAKEUP_ENABLE=FALSE
PKG_CONF_OPTS += -DRKAPPLUS_AED_ENABLE=TRUE
PKG_CONF_OPTS += -DRK3588=TRUE PKG_CONF_OPTS += -DRK3588=TRUE
endif endif
@@ -33,6 +38,9 @@ PKG_CONF_OPTS += -DAEC_ANR_AGC_ENABLE=TRUE
PKG_CONF_OPTS += -DANR_ENABLE=TRUE PKG_CONF_OPTS += -DANR_ENABLE=TRUE
PKG_CONF_OPTS += -DMOVE_DETECT_ENABLE=TRUE PKG_CONF_OPTS += -DMOVE_DETECT_ENABLE=TRUE
PKG_CONF_OPTS += -DOCCLUSION_DETECT_ENABLE=TRUE PKG_CONF_OPTS += -DOCCLUSION_DETECT_ENABLE=TRUE
PKG_CONF_OPTS += -DRKAPPLUS_ENABLE=TRUE
PKG_CONF_OPTS += -DRKAPPLUS_WAKEUP_ENABLE=FALSE
PKG_CONF_OPTS += -DRKAPPLUS_AED_ENABLE=TRUE
PKG_CONF_OPTS += -DRV1126_RV1109=TRUE PKG_CONF_OPTS += -DRV1126_RV1109=TRUE
endif endif
@@ -41,6 +49,9 @@ PKG_CONF_OPTS += -DUSE_64BIT=FALSE
PKG_CONF_OPTS += -DUSE_32BIT=FALSE PKG_CONF_OPTS += -DUSE_32BIT=FALSE
PKG_CONF_OPTS += -DUSE_UCLIBC=TRUE PKG_CONF_OPTS += -DUSE_UCLIBC=TRUE
PKG_CONF_OPTS += -DRKAPPLUS_ENABLE=TRUE PKG_CONF_OPTS += -DRKAPPLUS_ENABLE=TRUE
PKG_CONF_OPTS += -DRKAPPLUS_WAKEUP_ENABLE=FALSE
PKG_CONF_OPTS += -DRKAPPLUS_WAKEUP_MODE1_NN_ENABLE=FALSE
PKG_CONF_OPTS += -DRKAPPLUS_WAKEUP_MODE2_NN_ENABLE=FALSE
PKG_CONF_OPTS += -DRKAPPLUS_AED_ENABLE=TRUE PKG_CONF_OPTS += -DRKAPPLUS_AED_ENABLE=TRUE
PKG_CONF_OPTS += -DRV1106_RV1103=TRUE PKG_CONF_OPTS += -DRV1106_RV1103=TRUE
endif endif

@@ -13,16 +13,10 @@ if (ANR_ENABLE)
add_subdirectory(rkap_anr) add_subdirectory(rkap_anr)
endif() endif()
if (RKAPPLUS_ENABLE) if(RKAPPLUS_AED_ENABLE OR RKAPPLUS_ENABLE OR RKAPPLUS_EFFECT_EQDRC_ENABLE)
add_subdirectory(rkskv) add_subdirectory(rkaudio_algorithms)
endif() endif()
if (RKAPPLUS_AED_ENABLE) if(ROCKAA_ENABLE)
add_subdirectory(rkaed) add_subdirectory(rockaa)
endif()
if (USE_UCLIBC)
if(RKAPPLUS_AED_ENABLE OR RKAPPLUS_ENABLE)
add_subdirectory(rkaudio_algorithms)
endif()
endif() endif()

@@ -1,29 +0,0 @@
#1.cmake version
cmake_minimum_required(VERSION 3.2)
#2.project name
project(RKAED)
# ----------------------------------------------------------------------------
# install headers
# ----------------------------------------------------------------------------
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/include/
DESTINATION "include"
FILES_MATCHING PATTERN "*.h"
)
# ----------------------------------------------------------------------------
# install libs
# ----------------------------------------------------------------------------
if (USE_32BIT)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_detect.so
DESTINATION "lib"
)
endif()
if (USE_64BIT)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_detect.so
DESTINATION "lib"
)
endif()

@@ -1,161 +0,0 @@
#ifdef _MSC_VER
#include "audio/wave_reader.h"
#include "audio/wave_writer.h"
#include "skv/rkaudio_sed.h"
#else
#include "wave_reader.h"
#include "wave_writer.h"
#include "rkaudio_sed.h"
#endif
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <assert.h>
#include <iostream>
#define IN_SIZE 256
double clk_sed;
static clock_t clk_start, clk_end;
int g_mem_cost;
typedef unsigned char SKV_BYTE;
int main(int argc, char* argv[])
{
int ret = 0;
double Total_time_s = 0.0, Total_time_e = 0.0;
double Total_sample = 0.0;
double Tmp_sample = 0.0;
// Read the data and process it
clock_t startTime, endTime;
#if 1
// Input arguments
argc = 3;
#if ENABLE_8k
argv[1] = (char*)"babycry_8k.wav";
argv[2] = (char*)"babycry_8k_out.wav";
#else
argv[1] = (char*)"total.wav";
argv[2] = (char*)"total_out.wav";
#endif
#endif
/* For Debug */
int aed_stat = 0;
int out_size = 0, in_size = 0, res = 0;
int bcd_stat = 0, bcd_stat_old = 0;
if (argc < 3)
{
printf("Error: Wrong input parameters! A example is as following: \n");
printf("fosafer_enh test_in.wav test_out.wav 0\n");
exit(-1);
}
char* in_filename = argv[1];
char* out_filename = argv[2];
// for wave reader
wave_reader* wr;
wave_reader_error rerror;
// for wave writer
wave_writer* ww;
wave_writer_error werror;
wave_writer_format format;
// Open the input audio
wr = wave_reader_open(in_filename, &rerror);
if (!wr) {
printf("rerror=%d\n", rerror);
return -1;
}
int mSampleRate = wave_reader_get_sample_rate(wr);
int mBitPerSample = wave_reader_get_sample_bits(wr);
int mNumChannel = wave_reader_get_num_channels(wr);
// Input check
if (mNumChannel > 1) {
printf("This algorithm is a single channel algorithm and will run on the first channel of data\n");
}
// Size of each read, in bytes
int read_size = IN_SIZE * mNumChannel * mBitPerSample / 8;
SKV_BYTE* in = (SKV_BYTE*)malloc(read_size * sizeof(SKV_BYTE));
int out_res_num = 5;
short* out = (short*)malloc(IN_SIZE * out_res_num * sizeof(short));
// Configure the output audio format
format.num_channels = 5;
format.sample_rate = mSampleRate;
format.sample_bits = mBitPerSample;
ww = wave_writer_open(out_filename, &format, &werror);
// Failed to create the output audio file
if (!ww)
{
printf("werror=%d\n", werror);
wave_reader_close(wr);
return -1;
}
// Initialize sound event detection
RKAudioSedRes sed_res;
RKAudioSedParam* sed_param = rkaudio_sed_param_init();
void* st_sed = rkaudio_sed_init(mSampleRate, mBitPerSample, mNumChannel, sed_param);
rkaudio_sed_param_destroy(sed_param);
if (st_sed == NULL) {
printf("Failed to create baby cry handle\n");
return -1;
}
startTime = clock();
int cnt = 0;
while (0 < (res = wave_reader_get_samples(wr, IN_SIZE, in)))
{
in_size = res * (mBitPerSample / 8) * mNumChannel;
cnt++;
clk_start = clock();
out_size = rkaudio_sed_process(st_sed, (short*)in, in_size / 2, &sed_res);
if (out_size < 0)
fprintf(stderr, "bcd process return error=%d\n", out_size);
clk_end = clock();
clk_sed += clk_end - clk_start;
//printf("lsd=%d,snr=%d,bcd=%d,buz_res=%d,gbs_res=%d\n", sed_res.lsd_res, sed_res.snr_res, sed_res.bcd_res, sed_res.buz_res, sed_res.gbs_res);
// Output, for testing
for (int i = 0; i < IN_SIZE; i++) {
*(out + out_res_num * i) = 10000 * sed_res.snr_res;
*(out + out_res_num * i + 1) = 10000 * sed_res.lsd_res;
*(out + out_res_num * i + 2) = 10000 * sed_res.bcd_res;
*(out + out_res_num * i + 3) = 10000 * sed_res.buz_res;
*(out + out_res_num * i + 4) = 10000 * sed_res.gbs_res;
}
wave_writer_put_samples(ww, IN_SIZE, out);
Total_sample += in_size / 2 / mNumChannel;
//if (cnt % 63 == 0)
// printf("cnt = %d\n", cnt);
}
endTime = clock();
printf("Finished, speech_time = %f, cost_time = %f\n", \
Total_sample / mSampleRate, (double)(endTime - startTime) / CLOCKS_PER_SEC);
printf("sed = %f\n", clk_sed / CLOCKS_PER_SEC);
wave_writer_close(ww, &werror);
wave_reader_close(wr);
free(in);
free(out);
// Release
if (st_sed)
rkaudio_sed_destroy(st_sed);
return 0;
}

@@ -1,412 +0,0 @@
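# Rockchip Sound Event Detection Development Guide
Document ID: RK-KF-SF-959
Release version: V1.1.0
Date: 2022-12-15
Confidentiality: □ Top Secret □ Secret □ Internal ■ Public
**Disclaimer**
This document is provided "as is". Rockchip Electronics Co., Ltd. ("the Company", hereinafter) makes no express or implied statement or warranty regarding the accuracy, reliability, completeness, merchantability, fitness for a particular purpose, or non-infringement of any statement, information, or content in this document. This document is intended only as a usage reference.
Due to product version upgrades or other reasons, this document may be updated or modified from time to time without notice.
**Trademark Statement**
“Rockchip”, “瑞芯微”, and “瑞芯” are registered trademarks of the Company and are owned by the Company.
All other registered trademarks or trademarks that may be mentioned in this document are the property of their respective owners.
**Copyright © 2022 Rockchip Electronics Co., Ltd.**
Beyond the scope of fair use, no organization or individual may excerpt or copy part or all of this document, or distribute it in any form, without the Company's written permission.
瑞芯微电子股份有限公司
Rockchip Electronics Co., Ltd.
Address: No. 18, Area A, Software Park, Tongpan Road, Fuzhou, Fujian, China
Website: [www.rock-chips.com](http://www.rock-chips.com)
Customer service phone: +86-4007-700-590
Customer service fax: +86-591-83951833
Customer service email: [fae@rock-chips.com](mailto:fae@rock-chips.com)
---
**Product Version**
| **Chip name** | **Kernel version** |
| ------------ | ------------ |
| All series | Generic |
**Intended Audience**
This document (this guide) is mainly intended for the following engineers:
Technical support engineers
Software development engineers
**Revision History**
| **Version** | **Author** | **Date** | **Description** |
| ---------- | ------------ | :----------- | --------------------------------- |
| V1.0.0 | 廖华平, 江迪 | 2022-07-23 | Initial version |
| V1.0.1 | 廖华平, 郑兴 | 2022-08-15 | Cleaned up document formatting |
| V1.0.2 | 廖华平 | 2022-08-20 | Updated the interface; added buzzer detection |
| V1.1.0 | 赖陈潇 | 2022-12-15 | Updated the interface; added AGC and glass-break detection |
---
**Contents**
[TOC]
---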
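## Overview
This document describes the Sound Event Detection feature. It currently includes Baby Cry Detection, Abnormal Event Detection, and Buzz Detection. The three modules are called through a unified interface but are functionally independent, and any of them can be enabled or disabled with a switch.
## Function Description
### Abnormal Event Detection(AED)
AED performs real-time abnormal sound detection, consisting of loudness detection and signal-to-noise ratio (SNR) detection. Loudness detection measures the level in dB: it outputs 1 when the level exceeds the configured dB value and 0 otherwise. SNR detection distinguishes signal from noise, where noise mainly means stationary ambient noise and the recording noise floor: it outputs 1 when the SNR exceeds the configured threshold and 0 otherwise.
The decibel is a unit that expresses the ratio between two quantities of the same unit. Amplitude is the absolute value of an audio sample. Decibels (dB) and amplitude X are related by:
$$
\mathrm{dB} = 20 \log_{10}(X)
$$
Multiplying the amplitude therefore corresponds to adding dB: each doubling of the amplitude raises the level by about 6 dB. For 16-bit audio, the full-scale amplitude is 32767, which is defined as 0 dB, so the dB values discussed here are all at or below 0 dB.
The signal-to-noise ratio (SNR) can be understood as the ratio of signal to noise: if the noise is taken as 0 dB and the signal is 6 dB above the noise, the SNR is 6 dB.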
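### AI Sound Event Detection (SED)
The SED module detects baby cries, buzzer alarms, and glass breaking in real time. It uses an RCNN model with multi-head attention to analyze roughly 1.5 s of audio in order to detect these sound events. The module performs well when the signal-to-noise ratio is above 6 dB.
#### Baby Cry Detection(BCD)
BCD detects baby cries in real time. Detection is based on deep learning and works best at high SNR. Measured from the onset of the cry, the detection latency is about 2 s.
#### Buzz Detection(BUZ)
BUZ detects buzzer alarm sounds in real time. It mainly targets common alarms, including smoke alarms, air-raid sirens, and burglar alarms. Detection is based on deep learning and works best at high SNR. Measured from the onset of the alarm, the detection latency is about 2 s.
#### Glass broken Detection(GBS)
GBS detects the sound of breaking glass in real time. Detection is based on deep learning and works best at high SNR. Measured from the onset of the glass-break sound, the detection latency is about 0.6 s.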
该功能模块为用户提供以下API:
- [**rkaudio_sed_init**](#rkaudio_sed_init)SED初始化。
- [**rkaudio_sed_destroy**](#rkaudio_sed_destroy)SED销毁。
- [**rkaudio_sed_process**](#rkaudio_sed_process)SED执行。
### rkaudio_sed_init
【描述】
初始化并返回SED的操作句柄此句柄用于[rkaudio_sed_process](#rkaudio_sed_process)。使用结束后,执行[rkaudio_sed_destroy](#rkaudio_sed_destroy)销毁。
【语法】
void *rkaudio_sed_init(int fs, int bit, int chan, [RKAudioSedParam](#RKAudioSedParam) *param)
【参数】
| 参数名 | 描述 | 输入/输出 |
| ------ | ------------------------------------------------------------ | --------- |
| fs | 采样率AED支持8k和16kBCD、BUZ及GBS只支持16k。 | 输入 |
| bit | 每个数据的bit数一般使用的都是16bit数据。 | 输入 |
| chan | 通道数,如果输入多通道数据,使用的是第一个通道的数据。 | 输入 |
| param | SED参数相关定义见[RKAudioSedParam](#RKAudioSedParam)。可通过函数[rkaudio_sed_param_init](#rkaudio_sed_param_init)构建,也可自行构建相关函数和初始化系数。 | 输入 |
【返回值】
| 返回值 | 描述 |
| ------ | ------ |
| NULL | 失败。 |
| 非NULL | 成功。 |
### rkaudio_sed_destroy
【描述】
销毁SED句柄。
【语法】
void rkaudio_sed_destroy(void *st_)
【参数】
| 参数名 | 描述 | 输入/输出 |
| ------ | ------ | --------- |
| st | 句柄。 | 输入 |
【返回值】
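### rkaudio_sed_process
【Description】
Runs sound event detection and stores the result in res.
【Syntax】
int rkaudio_sed_process(void *st_, short *in, int in_size, [RKAudioSedRes](#RKAudioSedRes) *res)
【Parameters】
| Parameter | Description | Input/Output |
| ------- | ------------------------------------------------------------ | --------- |
| st_ | Handle. | Input |
| in | Pointer to the input data. | Input |
| in_size | Length of the input data. For 8 kHz data the size should be a multiple of 128; for 16 kHz data it should be a multiple of 256. | Input |
| res | Pointer to the result structure. The caller must allocate this structure; see [RKAudioSedRes](#RKAudioSedRes) for its definition. | Output |
【Return value】
| Return value | Description |
| --------- | ------------------------------------ |
| >= 0 | Success. The return value is the length of the data processed. |
| < 0 | Failure. |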
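The block-size rule for `in_size` can be sketched as a small pair of helpers. This is only an illustration under the rule stated in the table (128 at 8 kHz, 256 at 16 kHz); `sed_block_samples` and `sed_usable_samples` are hypothetical names and are not part of rkaudio_sed.h.

```c
/* Block size implied by the in_size rule: 128 at 8 kHz, 256 at 16 kHz.
 * Returns 0 for sample rates the rule does not cover.
 * Hypothetical helper, not part of rkaudio_sed.h. */
int sed_block_samples(int fs)
{
    switch (fs) {
    case 8000:  return 128;
    case 16000: return 256;
    default:    return 0;
    }
}

/* Largest length <= n_samples that is a whole number of blocks.
 * Returns -1 if the sample rate is unsupported or n_samples is negative. */
int sed_usable_samples(int fs, int n_samples)
{
    int block = sed_block_samples(fs);
    if (block == 0 || n_samples < 0)
        return -1;
    return n_samples - (n_samples % block);
}
```

A caller could trim a capture buffer with `sed_usable_samples` before passing its length as `in_size`, and carry the remainder over to the next call.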
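### rkaudio_sed_param_init
【Description】
Initializes the SED module parameters, enables the submodules, and calls each submodule's parameter initialization function. The source of this function is public and every parameter has a default value, so you can adjust the values to suit your data. After use, call [rkaudio_sed_param_destroy](#rkaudio_sed_param_destroy) to release the parameters. To change parameters or the set of enabled modules while SED is in use, destroy the SED module and initialize it again for the change to take effect.
【Syntax】
RKAudioSedParam *rkaudio_sed_param_init()
【Return value】
Pointer to the SED parameters; see [RKAudioSedParam](#RKAudioSedParam) for the definition.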
### rkaudio_sed_param_destroy
【Description】
Destroys the SED module parameters.
【Syntax】
void rkaudio_sed_param_destroy([RKAudioSedParam](#RKAudioSedParam) *param)
【Parameters】
Pointer to the SED parameters.
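### rkaudio_sed_param_aed
【Description】
Initializes the AED module parameters. It is called from rkaudio_sed_param_init, and the result is released in rkaudio_sed_param_destroy.
【Syntax】
[SedAedParam](#SedAedParam) *rkaudio_sed_param_aed()
【Return value】
Pointer to the AED module parameters.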
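### rkaudio_sed_param
【Description】
Initializes the BCD/BUZ/GBS module parameters. It is called from rkaudio_sed_param_init, and the result is released in rkaudio_sed_param_destroy.
【Syntax】
[SedParam](#SedParam) *rkaudio_sed_param()
【Return value】
Pointer to the BCD/BUZ/GBS module parameters.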
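### rkaudio_agc_param_init
【Description】
Initializes the AGC parameters. It is called from rkaudio_sed_param_init, and the result is released in rkaudio_sed_param_destroy.
【Syntax】
RKAGCParam* rkaudio_agc_param_init()
【Return value】
Pointer to the AGC module parameters.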
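## Parameters
### JUMP_FRAME
【Notes】
SED algorithm parameter: the number of frames between detections. A higher value means fewer detections per second and less computation. For example, with a value of 20 the detection interval is 0.016 * 20 = 0.32 s, which is roughly three detections per second. The recommended range is 15-25. Skipped frames reuse the detection result of the previous frame.
【Definition】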
```c
#define JUMP_FRAME 20
```
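The interval arithmetic for JUMP_FRAME can be checked with a short helper. This sketch assumes that one frame is 256 samples at 16 kHz, which yields the 0.016 s per frame used in the text and matches the `in_size` rule for 16 kHz data; `sed_detect_interval_s` is a hypothetical name, not a library function.

```c
#define JUMP_FRAME 20

/* Seconds between two detections when one detection runs every
 * `jump_frame` frames of `frame_samples` samples at `sample_rate` Hz.
 * Hypothetical helper; the frame length is an assumption, see above. */
double sed_detect_interval_s(int jump_frame, int frame_samples, int sample_rate)
{
    return (double)jump_frame * (double)frame_samples / (double)sample_rate;
}
```

With `JUMP_FRAME` at 20, `sed_detect_interval_s(JUMP_FRAME, 256, 16000)` evaluates to 0.32 s. Values of 15 and 25 give 0.24 s and 0.40 s, so the recommended range trades roughly four detections per second against two and a half.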
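### RKAudioSedParam
【Notes】
SED algorithm parameters.
【Definition】
```c
typedef struct RKAudioSedParam_
{
int model_en;
RKAGCParam *agc_param;
SedAedParam *aed_param;
SedParam *sed_param;
} RKAudioSedParam;
```
【Members】
| Member | Description |
| ----------- | ------------------------------------------------------------ |
| model_en | Bit mask that enables the submodules; see [RKAudioSedEnable](#RKAudioSedEnable) for the bit definitions.<br/>To enable both AED and SED, set it to EN_AED \| EN_SED. |
| agc_param | AGC module parameters, built by [rkaudio_agc_param_init](#rkaudio_agc_param_init). |
| aed_param | AED module parameters; see [SedAedParam](#SedAedParam) for the definition. |
| sed_param | BCD/BUZ/GBS module parameters; see [SedParam](#SedParam) for the definition. |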
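### SedAedParam
【Notes】
AED algorithm parameters.
【Definition】
```c
typedef struct SedAedParam_
{
float snr_db; // Outputs 1 when the SNR exceeds this threshold, in dB
float lsd_db; // Outputs 1 when the loudness exceeds this dB value; the maximum is 0 dB
int policy; // VAD sensitivity; rises from 0 to 2. Default is 1.
} SedAedParam;
```
【Members】
| Member | Description |
| -------- | ------------------------------------------------------------ |
| snr_db | Speech SNR threshold. Outputs 1 when exceeded. |
| lsd_db | Loudness threshold. Outputs 1 when exceeded. The maximum is 0 dB. |
| policy | Sensitivity of the SNR detection algorithm, in the range [0, 2]. Higher values are more sensitive, so the detection threshold is met more easily. Default is 1. |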
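### SedParam
【Notes】
BCD/BUZ/GBS algorithm parameters.
【Definition】
```c
typedef struct SedParam_
{
int frm_len; // Length of the statistics window, in frames
int nclass; // Number of classes
int babycry_decision_len; // Confirmation length for baby cries, in frames
int buzzer_decision_len; // Confirmation length for buzzer alarms, in frames
int glassbreaking_decision_len; // Confirmation length for breaking glass, in frames
} SedParam;
```
【Members】
| Member | Description |
| -------------------------- | ------------------------------------------------------------ |
| frm_len | Total number of frames in the statistics window. Recommended range: 110-150. Longer windows increase detection latency; shorter windows make missed or false detections more likely. |
| nclass | Total number of classes of interest. Fixed at 3; do not change. |
| babycry_decision_len | Confirmation length for baby-cry detection, in frames. Must be less than frm_len; recommended value 100. Longer values increase latency and missed detections; shorter values increase false detections. |
| buzzer_decision_len | Confirmation length for buzzer-alarm detection, in frames. Must be less than frm_len; recommended value 100. Longer values increase latency and missed detections; shorter values increase false detections. |
| glassbreaking_decision_len | Confirmation length for glass-break detection, in frames. Must be less than frm_len and greater than JUMP_FRAME; recommended range 25-50. Longer values increase latency and missed detections; shorter values increase false detections. |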
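The constraints in the members table can be expressed as a pre-flight check before calling rkaudio_sed_init. This is a minimal sketch: `sed_param_is_valid` is a hypothetical helper, the struct is repeated only so that the block compiles on its own, and whether the library performs any such validation itself is not stated in the source. The recommended ranges are deliberately not enforced, only the hard constraints.

```c
#include <stddef.h>

#define JUMP_FRAME 20

typedef struct SedParam_ {
    int frm_len;
    int nclass;
    int babycry_decision_len;
    int buzzer_decision_len;
    int glassbreaking_decision_len;
} SedParam;

/* Returns 1 if `p` satisfies the hard constraints from the table, else 0.
 * Hypothetical helper, not part of rkaudio_sed.h. */
int sed_param_is_valid(const SedParam *p)
{
    if (p == NULL)
        return 0;
    if (p->nclass != 3)                          /* fixed at 3 */
        return 0;
    if (p->babycry_decision_len >= p->frm_len)   /* must be < frm_len */
        return 0;
    if (p->buzzer_decision_len >= p->frm_len)    /* must be < frm_len */
        return 0;
    if (p->glassbreaking_decision_len >= p->frm_len ||
        p->glassbreaking_decision_len <= JUMP_FRAME)
        return 0;                                /* JUMP_FRAME < len < frm_len */
    return 1;
}
```

The default values from rkaudio_sed_param (125, 3, 100, 100, 30) pass this check, while lowering glassbreaking_decision_len to 20 fails it because the value no longer exceeds JUMP_FRAME.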
### RKAudioSedRes
【Notes】
Result returned by the SED module.
【Definition】
```c
typedef struct RKAudioSedRes_ {
int snr_res;
int lsd_res;
int bcd_res;
int buz_res;
int gbs_res;
} RKAudioSedRes;
```
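【Members】
| Member | Description |
| -------- | --------------------------------------------- |
| snr_res | SNR result: 1 if the SNR threshold is met, 0 otherwise. |
| lsd_res | LSD result: 1 if the loudness threshold is met, 0 otherwise. |
| bcd_res | BCD result: 1 if a baby cry is detected, 0 otherwise. |
| buz_res | BUZ result: 1 if an alarm sound is detected, 0 otherwise. |
| gbs_res | GBS result: 1 if breaking glass is detected, 0 otherwise. |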
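Because skipped frames repeat the previous result, a flag such as bcd_res typically stays at 1 across many consecutive calls. An application that wants one notification per event can compare successive results. The sketch below shows this for baby-cry detection; `sed_bcd_rising_edge` is a hypothetical helper, and the struct is repeated only to keep the block self-contained.

```c
typedef struct RKAudioSedRes_ {
    int snr_res;
    int lsd_res;
    int bcd_res;
    int buz_res;
    int gbs_res;
} RKAudioSedRes;

/* Returns 1 only when bcd_res changes from 0 in `prev` to 1 in `cur`,
 * i.e. on the first result that reports a baby cry.
 * Hypothetical helper, not part of rkaudio_sed.h. */
int sed_bcd_rising_edge(const RKAudioSedRes *prev, const RKAudioSedRes *cur)
{
    return prev->bcd_res == 0 && cur->bcd_res == 1;
}
```

The same comparison applies to buz_res and gbs_res. The caller keeps a copy of the last result and updates it after each rkaudio_sed_process call.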
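### RKAudioSedEnable
【Notes】
Enables individual modules: assign one of these values to model_en to enable the corresponding module. To enable several modules, OR the flags together, for example EN_AED | EN_SED.
【Definition】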
```c
typedef enum RKAudioSedEnable_
{
EN_AGC = 1 << 0,
EN_AED = 1 << 1,
EN_SED = 1 << 2,
} RKAudioSedEnable;
```
【Members】
| Member | Description |
| -------- | --------------------------------------- |
| EN_AGC | Enables the AGC module. Recommended when the captured volume is low. |
| EN_AED | Enables the AED module. |
| EN_SED | Enables the SED module. |
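The flag arithmetic on model_en can be illustrated with the enum as defined above. `sed_module_enabled` and `sed_module_disable` are hypothetical helpers added for illustration; the library itself only documents assigning the combined value to model_en.

```c
typedef enum RKAudioSedEnable_ {
    EN_AGC = 1 << 0,
    EN_AED = 1 << 1,
    EN_SED = 1 << 2,
} RKAudioSedEnable;

/* Returns non-zero if `flag` is set in a model_en value.
 * Hypothetical helper, not part of rkaudio_sed.h. */
int sed_module_enabled(int model_en, RKAudioSedEnable flag)
{
    return (model_en & (int)flag) != 0;
}

/* Returns model_en with `flag` cleared and all other bits unchanged.
 * Hypothetical helper, not part of rkaudio_sed.h. */
int sed_module_disable(int model_en, RKAudioSedEnable flag)
{
    return model_en & ~(int)flag;
}
```

For instance, starting from the default `EN_AGC | EN_AED | EN_SED` set by rkaudio_sed_param_init, clearing EN_AGC leaves `EN_AED | EN_SED`. As noted for rkaudio_sed_param_init, such a change only takes effect after SED is destroyed and initialized again.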
### Other parameters
【Notes】
Changing the AGC parameters and the other remaining parameters is not recommended.

@@ -1,168 +0,0 @@
/* Copyright (C) RK
Written by Ryne
Date : 20221214*/
#ifndef RKAUDIO_SED_H
#define RKAUDIO_SED_H
#include <stdlib.h>
#ifdef __cplusplus
extern "C" {
#endif
#define JUMP_FRAME 20
typedef struct RKAGCParam_ {
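/* New-style AGC parameters */
float attack_time; /* Attack time: time needed for the AGC gain to decrease */
float release_time; /* Release time: time needed for the AGC gain to increase */
float max_gain; /* Maximum gain, also the gain of the linear segment, in dB */
float max_peak; /* Maximum energy of the output speech after AGC processing, in dB */
float fRth0; /* Energy threshold (dB) where the expansion segment ends and the linear segment begins */
float fRk0; /* Slope of the expansion segment */
float fRth1; /* Energy threshold (dB) where the compression segment begins and the linear segment ends */
/* Unused parameters */
int fs; /* Sample rate of the data */
int frmlen; /* Processing frame length */
float attenuate_time; /* Noise attenuation time: time for the gain in noise segments to decay to 1 */
float fRth2; /* Energy threshold (dB) where the compression segment begins */
float fRk1; /* Slope of the expansion segment */
float fRk2; /* Slope of the expansion segment */
float fLineGainDb; /* Gain boost of the linear segment, in dB */
int swSmL0; /* Number of time-domain smoothing points for the expansion segment */
int swSmL1; /* Number of time-domain smoothing points for the linear segment */
int swSmL2; /* Number of time-domain smoothing points for the compression segment */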
} RKAGCParam;
inline static RKAGCParam* rkaudio_agc_param_init()
{
RKAGCParam* param = (RKAGCParam*)malloc(sizeof(RKAGCParam));
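/* New-style AGC parameters */
param->attack_time = 200.0; /* Attack time: time needed for the AGC gain to decrease */
param->release_time = 200.0; /* Release time: time needed for the AGC gain to increase */
//param->max_gain = 35.0; /* Maximum gain, also the gain of the linear segment, in dB */
param->max_gain = 25; /* Maximum gain, also the gain of the linear segment, in dB */
param->max_peak = -1.0; /* Maximum energy of the output speech after AGC processing, in dB */
param->fRk0 = 2; /* Slope of the expansion segment */
param->fRth2 = -40; /* Energy threshold (dB) where the compression segment begins and the linear segment ends; note that fRth2 + max_gain < max_peak */
param->fRth1 = -65; /* Energy threshold (dB) where the expansion segment ends and the linear segment begins */
param->fRth0 = -70; /* Noise gate threshold */
/* Unused parameters */
param->fs = 16000; /* Sample rate of the data */
param->frmlen = 256; /* Processing frame length */
param->attenuate_time = 1000; /* Noise attenuation time: time for the gain in noise segments to decay to 1 */
param->fRk1 = 0.8; /* Slope of the expansion segment */
param->fRk2 = 0.4; /* Slope of the expansion segment */
param->fLineGainDb = -25.0f; /* Below this value, no gain is applied during the initial attenuate_time (ms) */
param->swSmL0 = 40; /* Number of time-domain smoothing points for the expansion segment */
param->swSmL1 = 80; /* Number of time-domain smoothing points for the linear segment */
param->swSmL2 = 80; /* Number of time-domain smoothing points for the compression segment */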
return param;
}
typedef struct SedAedParam_
{
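float snr_db; // Outputs 1 when the SNR exceeds this threshold, in dB
float lsd_db; // Outputs 1 when the loudness exceeds this dB value; the maximum is 0 dB
int policy; // VAD sensitivity; rises from 0 to 2. Default is 1.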
} SedAedParam;
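// Sound event detection
typedef struct SedParam_
{
int frm_len; // Length of the statistics window, in frames; recommended 110-150
int nclass; // Number of classes
int babycry_decision_len; // Confirmation length for baby cries, in frames
int buzzer_decision_len; // Confirmation length for buzzer alarms, in frames
int glassbreaking_decision_len; // Confirmation length for breaking glass, in frames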
} SedParam;
typedef struct RKAudioSedRes_ {
int snr_res;
int lsd_res;
int bcd_res;
int buz_res;
int gbs_res;
} RKAudioSedRes;
typedef enum RKAudioSedEnable_
{
EN_AGC = 1 << 0,
EN_AED = 1 << 1,
EN_SED = 1 << 2,
} RKAudioSedEnable;
typedef struct RKAudioSedParam_
{
int model_en;
RKAGCParam* agc_param;
SedAedParam *aed_param;
SedParam* sed_param;
} RKAudioSedParam;
static SedAedParam *rkaudio_sed_param_aed() {
SedAedParam* param = (SedAedParam *)calloc(sizeof(SedAedParam), 1);
param->snr_db = 10;
param->lsd_db = -25;
param->policy = 1;
return param;
}
static SedParam* rkaudio_sed_param()
{
SedParam* param = (SedParam*)malloc(sizeof(SedParam));
param->frm_len = 125;
param->nclass = 3;
param->babycry_decision_len = 100;
param->buzzer_decision_len = 100;
param->glassbreaking_decision_len = 30;
return param;
}
inline static RKAudioSedParam *rkaudio_sed_param_init()
{
RKAudioSedParam *param = (RKAudioSedParam *)calloc(sizeof(RKAudioSedParam), 1);
param->model_en = EN_AGC | EN_AED | EN_SED ;
param->agc_param = rkaudio_agc_param_init();
param->aed_param = rkaudio_sed_param_aed();
param->sed_param = rkaudio_sed_param();
return param;
}
inline static void rkaudio_sed_param_destroy(RKAudioSedParam *param)
{
if (param == NULL)
return;
if (param->agc_param)
{
free(param->agc_param);
param->agc_param = NULL;
}
if (param->aed_param)
{
free(param->aed_param);
param->aed_param = NULL;
}
if (param->sed_param)
{
free(param->sed_param);
param->sed_param = NULL;
}
free(param);
}
void *rkaudio_sed_init(int fs, int bit, int chan, RKAudioSedParam *param);
void rkaudio_sed_destroy(void *st_);
int rkaudio_sed_process(void *st_, short *in, int in_size, RKAudioSedRes *res);
#ifdef __cplusplus
}
#endif
/** @}*/
#endif

@@ -2,12 +2,12 @@
cmake_minimum_required(VERSION 3.2) cmake_minimum_required(VERSION 3.2)
#2.project name #2.project name
project(RKAUDIOUCLIBC) project(RKAUDIOALGORITHMS)
# ---------------------------------------------------------------------------- # ----------------------------------------------------------------------------
# install resource # install resource
# ---------------------------------------------------------------------------- # ----------------------------------------------------------------------------
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/ install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/conf/
DESTINATION "vqefiles" DESTINATION "vqefiles"
FILES_MATCHING PATTERN "*.json" FILES_MATCHING PATTERN "*.json"
) )
@@ -15,32 +15,185 @@ install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/
# ---------------------------------------------------------------------------- # ----------------------------------------------------------------------------
# install headers # install headers
# ---------------------------------------------------------------------------- # ----------------------------------------------------------------------------
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/ install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/include/
DESTINATION "include" DESTINATION "include"
FILES_MATCHING PATTERN "*.h" FILES_MATCHING PATTERN "*.h"
) )
# ---------------------------------------------------------------------------- # ----------------------------------------------------------------------------
# install libs # install libs
# ---------------------------------------------------------------------------- # ----------------------------------------------------------------------------
if (RKAPPLUS_ENABLE) if (USE_UCLIBC)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/libaec_bf_process.so if(RKAPPLUS_AED_ENABLE OR RKAPPLUS_ENABLE)
${CMAKE_CURRENT_SOURCE_DIR}/libaec_bf_process.a install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/librkaudio_common.so
DESTINATION "lib" ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/librkaudio_common.a
) DESTINATION "lib"
)
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/conf/
DESTINATION "vqefiles"
FILES_MATCHING PATTERN "*.rknn"
)
endif()
if(RKAPPLUS_ENABLE)
if (RKAPPLUS_WAKEUP_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_wakeup.so
DESTINATION "lib"
RENAME libaec_bf_process.so
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_wakeup.a
DESTINATION "lib"
RENAME libaec_bf_process.a
)
elseif (RKAPPLUS_AINR_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_ainr.so
DESTINATION "lib"
RENAME libaec_bf_process.so
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_ainr.a
DESTINATION "lib"
RENAME libaec_bf_process.a
)
elseif (RKAPPLUS_WAKEUP_MODE1_NN_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/../../../../iva/iva/librockiva/rockiva-rv1106-Linux/lib/librknnmrt.so
${CMAKE_CURRENT_SOURCE_DIR}/../../../../iva/iva/librockiva/rockiva-rv1106-Linux/lib/librknnmrt.a
DESTINATION "lib"
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_wakeup_mode1_nn.so
DESTINATION "lib"
RENAME libaec_bf_process.so
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_wakeup_mode1_nn.a
DESTINATION "lib"
RENAME libaec_bf_process.a
)
elseif (RKAPPLUS_WAKEUP_MODE2_NN_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/../../../../iva/iva/librockiva/rockiva-rv1106-Linux/lib/librknnmrt.so
${CMAKE_CURRENT_SOURCE_DIR}/../../../../iva/iva/librockiva/rockiva-rv1106-Linux/lib/librknnmrt.a
DESTINATION "lib"
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_wakeup_mode2_nn.so
DESTINATION "lib"
RENAME libaec_bf_process.so
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process_wakeup_mode2_nn.a
DESTINATION "lib"
RENAME libaec_bf_process.a
)
else()
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process.so
${CMAKE_CURRENT_SOURCE_DIR}/uclibc/libaec_bf_process.a
DESTINATION "lib"
)
endif()
endif()
if (RKAPPLUS_AED_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/librkaudio_detect.so
${CMAKE_CURRENT_SOURCE_DIR}/uclibc/librkaudio_detect.a
DESTINATION "lib"
)
endif()
if (RKAPPLUS_EFFECT_EQDRC_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/uclibc/librkaudio_effect_eqdrc.so
${CMAKE_CURRENT_SOURCE_DIR}/uclibc/librkaudio_effect_eqdrc.a
DESTINATION "lib"
)
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/conf/
DESTINATION "vqefiles"
FILES_MATCHING PATTERN "*.bin"
)
endif()
endif() endif()
if (RKAPPLUS_AED_ENABLE) if (USE_32BIT)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/librkaudio_detect.so if(RKAPPLUS_AED_ENABLE OR RKAPPLUS_ENABLE)
${CMAKE_CURRENT_SOURCE_DIR}/librkaudio_detect.a install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_common.so
DESTINATION "lib" ${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_common.a
) DESTINATION "lib"
)
endif()
if(RKAPPLUS_ENABLE)
if (RKAPPLUS_WAKEUP_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/libaec_bf_process_wakeup.so
DESTINATION "lib"
RENAME libaec_bf_process.so
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/libaec_bf_process_wakeup.a
DESTINATION "lib"
RENAME libaec_bf_process.a
)
else()
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/libaec_bf_process.so
${CMAKE_CURRENT_SOURCE_DIR}/lib32/libaec_bf_process.a
DESTINATION "lib"
)
endif()
endif()
if (RKAPPLUS_AED_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_detect.so
${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_detect.a
DESTINATION "lib"
)
endif()
if (RKAPPLUS_EFFECT_EQDRC_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_effect_eqdrc.so
${CMAKE_CURRENT_SOURCE_DIR}/lib32/librkaudio_effect_eqdrc.a
DESTINATION "lib"
)
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/conf/
DESTINATION "vqefiles"
FILES_MATCHING PATTERN "*.bin"
)
endif()
endif() endif()
if (RKAPPLUS_ENABLE OR RKAPPLUS_AED_ENABLE) if (USE_64BIT)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/librkaudio_common.so if(RKAPPLUS_AED_ENABLE OR RKAPPLUS_ENABLE)
${CMAKE_CURRENT_SOURCE_DIR}/librkaudio_common.a install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_common.so
DESTINATION "lib" ${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_common.a
) DESTINATION "lib"
endif() )
endif()
if(RKAPPLUS_ENABLE)
if (RKAPPLUS_WAKEUP_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib32/libaec_bf_process_wakeup.so
DESTINATION "lib"
RENAME libaec_bf_process.so
)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib64/libaec_bf_process_wakeup.a
DESTINATION "lib"
RENAME libaec_bf_process.a
)
else()
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib64/libaec_bf_process.so
${CMAKE_CURRENT_SOURCE_DIR}/lib64/libaec_bf_process.a
DESTINATION "lib"
)
endif()
endif()
if (RKAPPLUS_AED_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_detect.so
${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_detect.a
DESTINATION "lib"
)
endif()
if (RKAPPLUS_EFFECT_EQDRC_ENABLE)
install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_effect_eqdrc.so
${CMAKE_CURRENT_SOURCE_DIR}/lib64/librkaudio_effect_eqdrc.a
DESTINATION "lib"
)
install(DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/conf/
DESTINATION "vqefiles"
FILES_MATCHING PATTERN "*.bin"
)
endif()
endif()

@@ -0,0 +1,163 @@
{
"skv_configs": {
"aec": {
"status" : "enable",
"drop_ref_channel" : 0,
"model_aec_en" : "disable",
"delay_len": 0,
"look_ahead": 0,
"filter_len": 2
},
"delay": {
"status" : "disable",
"MaxFrame" : 32,
"LeastDelay" : 0,
"JumpFrame" : 12,
"DelayOffset" : 1,
"MicAmpThr" : 50,
"RefAmpThr" : 50,
"StartFreq" : 700,
"EndFreq" : 4000,
"SmoothFactor": "float:0.97"
},
"bf": {
"status" : "enable",
"targ": 4,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "enable"
},
"aes": {
"status": "enable",
"beta_up": "float:0.002",
"beta_down": "float:0.001",
"beta_up_low": "float:0.005",
"beta_down_low": "float:0.001",
"low_freq": 500,
"high_freq": 3750,
"thd_flag": 0,
"hard_flag": 0,
"LimitRatio_0": "2.0 1.5 1.0",
"LimitRatio_1": "1.5 1.2 1.0",
"ThdSplitFreq_0": "0 0",
"ThdSplitFreq_1": "1500 2000",
"ThdSplitFreq_2": "2000 6000",
"ThdSplitFreq_3": "6000 8000",
"ThdSupDegree_0": "0.01 0.01 0.005 0.005 0 0 0 0 0 0",
"ThdSupDegree_1": "0.0005 0.0005 0.0005 0 0 0 0 0 0 0",
"ThdSupDegree_2": "0.001 0.001 0.001 0.001 0.001 0.001 0 0 0 0",
"ThdSupDegree_3": "0.001 0.001 0.001 0.001 0.001 0.001 0 0 0 0",
"HardSplitFreq_0": "100 500",
"HardSplitFreq_1": "0 0",
"HardSplitFreq_2": "0 0",
"HardSplitFreq_3": "0 0",
"HardSplitFreq_4": "500 3000",
"HardThreshold": "0.35 0.15 0.35 0.15"
},
"eq": {
"status" : "disable",
"para_len" : 65,
"filter_bank_0": "-1 -1 -1 0 0 -3 -4 -1 1 -3 -7 -5 -2",
"filter_bank_1": "-3 -6 -6 -6 -9 -8 -2 -2 -12 -15 -2 8 -3",
"filter_bank_2": "-16 -8 7 6 -4 -3 -1 -9 -7 14 11 -33 -52",
"filter_bank_3": "-8 26 28 -99 -83 -24 -39 -104 -120 -118 -170 -175 -59",
"filter_bank_4": "-56 -344 -522 -126 337 -124 -1148 -943 891 1605 -1675 -7407 32764"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "enable",
"attack_time" : "float:200.0",
"release_time" : "float:200.0",
"attenuate_time" : "float:1000",
"max_gain": "float:25.0",
"max_peak": "float:-1.0",
"fRth0": "float:-55",
"fRth1": "float:-45",
"fRth2": "float:-30",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-25.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "enable",
"noiseFactor": "float:0.88",
"swU": 1,
"psiMin": "float:0.02",
"psiMax": "float:0.516",
"fGmin": "float:0.01",
"Sup_Freq1": -3588,
"Sup_Freq2": -3588,
"Sup_Energy1" : "float:10000.0",
"Sup_Energy2" : "float:10000.0",
"InterV" : 1,
"BiasMin" : "float:1.67",
"UpdateFrm":15,
"NPreGammaThr" : "float:4.6",
"NPreZetaThr" : "float:1.67",
"SabsGammaThr0" : "float:1.0",
"SabsGammaThr1" : "float:3.0",
"InfSmooth" : "float:0.8",
"ProbSmooth" : "float:0.7",
"CompCoeff" : "float:1.4",
"PrioriMin" : "float:0.0316",
"PostMax" : "float:40.0",
"PrioriRatio":"float:0.95",
"PrioriRatioLow" : "float:0.95",
"SplitBand" : 20,
"PrioriSmooth": "float:0.7",
"TranMode": 0
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "enable",
"rlsLg" :4,
"curveLg": 20,
"delay" : 2,
"forgetting" : "float:0.98",
"t60": "float:0.4",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain": "float:2.0",
"fMpy": "float:5",
"fSmoothAlpha": "float:0.99",
"fSpeechGain": "float:0.01"
},
"dtd": {
"status" : "disable",
"ksiThd_high": "float:0.7",
"ksiThd_low": "float:0.5"
},
"howl": {
"status" : "enable",
"mode" : 4
},
"doa": {
"status" : "disable",
"rad" : "float:0.0585",
"start_freq" : 800,
"end_freq" : 1600,
"lg_num" : 40,
"lg_pitch_num" : 1
},
"wind": {
"status": "disable"
},
"ainr": {
"status": "disable"
}
}
}

@@ -0,0 +1,161 @@
{
"skv_configs": {
"aec": {
"status" : "enable",
"drop_ref_channel" : 0,
"model_aec_en" : "disable",
"delay_len": 0,
"look_ahead": 0,
"filter_len": 2
},
"delay": {
"status" : "disable",
"MaxFrame" : 32,
"LeastDelay" : 0,
"JumpFrame" : 12,
"DelayOffset" : 1,
"MicAmpThr" : 50,
"RefAmpThr" : 50,
"StartFreq" : 700,
"EndFreq" : 4000,
"SmoothFactor": "float:0.97"
},
"bf": {
"status" : "enable",
"targ": 4,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "disable"
},
"aes": {
"status": "enable",
"beta_up": "float:0.002",
"beta_down": "float:0.001",
"beta_up_low": "float:0.005",
"beta_down_low": "float:0.001",
"low_freq": 500,
"high_freq": 3750,
"thd_flag": 0,
"hard_flag": 0,
"LimitRatio_0": "2.0 1.5 1.0",
"LimitRatio_1": "1.5 1.2 1.0",
"ThdSplitFreq_0": "0 0",
"ThdSplitFreq_1": "1500 2000",
"ThdSplitFreq_2": "2000 6000",
"ThdSplitFreq_3": "6000 8000",
"ThdSupDegree_0": "0.01 0.01 0.005 0.005 0 0 0 0 0 0",
"ThdSupDegree_1": "0.0005 0.0005 0.0005 0 0 0 0 0 0 0",
"ThdSupDegree_2": "0.001 0.001 0.001 0.001 0.001 0.001 0 0 0 0",
"ThdSupDegree_3": "0.001 0.001 0.001 0.001 0.001 0.001 0 0 0 0",
"HardSplitFreq_0": "100 500",
"HardSplitFreq_1": "0 0",
"HardSplitFreq_2": "0 0",
"HardSplitFreq_3": "0 0",
"HardSplitFreq_4": "500 3000",
"HardThreshold": "0.35 0.15 0.35 0.15"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "enable",
"attack_time" : "float:100.0",
"release_time" : "float:200.0",
"attenuate_time" : "float:1000",
"max_gain": "float:18.0",
"max_peak": "float:-1.0",
"fRth0": "float:-90",
"fRth1": "float:-88",
"fRth2": "float:-28",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-25.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "enable",
"noiseFactor": "float:-3588.0",
"swU": 8,
"psiMin": "float:0.02",
"psiMax": "float:0.516",
"fGmin": "float:0.05",
"Sup_Freq1": -3588,
"Sup_Freq2": -3588,
"Sup_Energy1" : "float:100000.0",
"Sup_Energy2" : "float:100000.0",
"InterV" : 8,
"BiasMin" : "float:1.67",
"UpdateFrm":15,
"NPreGammaThr" : "float:4.6",
"NPreZetaThr" : "float:1.67",
"SabsGammaThr0" : "float:1.0",
"SabsGammaThr1" : "float:3.0",
"InfSmooth" : "float:0.8",
"ProbSmooth" : "float:0.7",
"CompCoeff" : "float:1.4",
"PrioriMin" : "float:0.0316",
"PostMax" : "float:40.0",
"PrioriRatio":"float:0.95",
"PrioriRatioLow" : "float:0.95",
"SplitBand" : 20,
"PrioriSmooth": "float:0.7",
"TranMode": 0
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "disable",
"rlsLg" :4,
"curveLg": 20,
"delay" : 2,
"forgetting" : "float:0.98",
"t60": "float:0.4",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain": "float:2.0",
"fMpy": "float:5",
"fSmoothAlpha": "float:0.99",
"fSpeechGain": "float:0.01"
},
"dtd": {
"status" : "disable",
"ksiThd_high": "float:0.7",
"ksiThd_low": "float:0.5"
},
"howl": {
"status" : "disable",
"mode" : 5
},
"doa": {
"status" : "disable",
"rad" : "float:0.0585",
"start_freq" : 800,
"end_freq" : 1600,
"lg_num" : 40,
"lg_pitch_num" : 1
},
"wind": {
"status": "disable"
},
"wakeup": {
"status" : "disable",
"wakeup_words" : 1,
"mode" : 2,
"model_word2" : "/oem/usr/share/vqefiles/rkaudio_model_mode2_word2.rknn"
},
"ainr": {
"status": "enable",
"model_path": "/oem/usr/share/vqefiles/rkaudio_model_ainr.rknn"
}
}
}

@@ -0,0 +1,79 @@
{
"skv_configs": {
"aec": {
"status" : "enable",
"drop_ref_channel" : 0,
"delay_len" : -3588,
"look_ahead" : 0
},
"bf": {
"status" : "enable",
"targ" : 4,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "enable"
},
"aes": {
"status" : "enable",
"beta_up": "float:-3588.0",
"beta_down": "float:0.001"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "enable",
"attack_time" : "float:200.0",
"release_time" : "float:200.0",
"attenuate_time" : "float:1000",
"max_gain" : "float:25.0",
"max_peak" : "float:-1.0",
"fRth0": "float:-55",
"fRth1" : "float:-45",
"fRth2" : "float:-30",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-25.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "enable",
"noiseFactor" : "float:-3588.0",
"swU" : 10,
"psiMin" : "float:0.01",
"psiMax" : "float:0.516",
"fGmin" : "float:0.05"
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "enable",
"rlsLg" :4,
"curveLg" : 30,
"delay" : 2,
"forgetting" : "float:0.98",
"t60" : "float:0.4",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain" : "float:20.0",
"fMpy" : "float:20",
"fSmoothAlpha" : "float:0.99",
"fSpeechGain" : "float:0.0"
},
"dtd": {
"status" : "disable",
"ksiThd_high" : "float:0.6",
"ksiThd_low" : "float:0.3"
}
}
}

@@ -0,0 +1,55 @@
{
"skv_ao_configs": {
"anr": {
"status" : "enable",
"noiseFactor": "float:0.99",
"swU": 10,
"psiMin": "float:0.05",
"psiMax": "float:0.516",
"fGmin": "float:0.05",
"Sup_Freq1": -3588,
"Sup_Freq2": -3588,
"Sup_Energy1" : "float:100000.0",
"Sup_Energy2" : "float:100000.0",
"InterV" : 8,
"BiasMin" : "float:1.67",
"UpdateFrm":15,
"NPreGammaThr" : "float:4.6",
"NPreZetaThr" : "float:1.67",
"SabsGammaThr0" : "float:1.0",
"SabsGammaThr1" : "float:3.0",
"InfSmooth" : "float:0.8",
"ProbSmooth" : "float:0.7",
"CompCoeff" : "float:1.4",
"PrioriMin" : "float:0.0316",
"PostMax" : "float:40.0",
"PrioriRatio":"float:0.8",
"PrioriRatioLow" : "float:0.95",
"SplitBand" : 20,
"PrioriSmooth": "float:0.7",
"TranMode": 0
},
"agc": {
"status" : "enable",
"attack_time" : "float:200.0",
"release_time" : "float:200.0",
"attenuate_time" : "float:1000",
"max_gain": "float:5.0",
"max_peak": "float:-1.0",
"fRth0": "float:-45",
"fRth1": "float:-35",
"fRth2": "float:-25",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-25.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"howl": {
"status" : "enable",
"mode" : 2
}
}
}

@@ -0,0 +1,82 @@
{
"skv_configs": {
"aec": {
"status" : "disable",
"drop_ref_channel" : 0,
"delay_len" : 0,
"look_ahead" : 0
},
"bf": {
"status" : "enable",
"targ" : 0,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "disable"
},
"wakeup": {
"status" : "enable"
},
"aes": {
"status" : "disable",
"beta_up": "float:0.002",
"beta_down": "float:0.001"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "enable",
"attack_time" : "float:200.0",
"release_time" : "float:1000.0",
"attenuate_time" : "float:1000",
"max_gain" : "float:30.0",
"max_peak" : "float:-1.0",
"fRth0": "float:-95.0",
"fRth1" : "float:-90.0",
"fRth2" : "float:-35.0",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-35.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "enable",
"noiseFactor" : "float:0.88",
"swU" : 10,
"psiMin" : "float:0.02",
"psiMax" : "float:0.516",
"fGmin" : "float:0.1"
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "disable",
"rlsLg" :4,
"curveLg" : 30,
"delay" : 2,
"forgetting" : "float:0.98",
"t60" : "float:1.5",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain" : "float:10.0",
"fMpy" : "float:10.0",
"fSmoothAlpha" : "float:0.99",
"fSpeechGain" : "float:0.0"
},
"dtd": {
"status" : "disable",
"ksiThd_high" : "float:0.7",
"ksiThd_low" : "float:0.5"
}
}
}

@@ -0,0 +1,85 @@
{
"skv_configs": {
"aec": {
"status" : "disable",
"drop_ref_channel" : 0,
"delay_len" : 0,
"look_ahead" : 0
},
"bf": {
"status" : "enable",
"targ" : 3,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "disable"
},
"wakeup": {
"status" : "enable",
"wakeup_words" : 7,
"mode" : 1,
"model_word1" : "/oem/usr/share/vqefiles/rkaudio_model_mode1_word1.rknn"
},
"aes": {
"status" : "disable",
"beta_up": "float:0.002",
"beta_down": "float:0.001"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "disable",
"attack_time" : "float:200.0",
"release_time" : "float:1000.0",
"attenuate_time" : "float:1000",
"max_gain" : "float:30.0",
"max_peak" : "float:-1.0",
"fRth0": "float:-95.0",
"fRth1" : "float:-90.0",
"fRth2" : "float:-35.0",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-35.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "disable",
"noiseFactor" : "float:0.88",
"swU" : 10,
"psiMin" : "float:0.02",
"psiMax" : "float:0.516",
"fGmin" : "float:0.1"
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "disable",
"rlsLg" :4,
"curveLg" : 30,
"delay" : 2,
"forgetting" : "float:0.98",
"t60" : "float:1.5",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain" : "float:10.0",
"fMpy" : "float:10.0",
"fSmoothAlpha" : "float:0.99",
"fSpeechGain" : "float:0.0"
},
"dtd": {
"status" : "disable",
"ksiThd_high" : "float:0.7",
"ksiThd_low" : "float:0.5"
}
}
}

@@ -0,0 +1,86 @@
{
"skv_configs": {
"aec": {
"status" : "disable",
"drop_ref_channel" : 0,
"delay_len" : 0,
"look_ahead" : 0
},
"bf": {
"status" : "enable",
"targ" : 3,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "disable"
},
"wakeup": {
"status" : "enable",
"wakeup_words" : 1,
"mode" : 2,
"model_word1" : "/oem/usr/share/vqefiles/rkaudio_model_mode2_word1.rknn",
"model_word2" : "/oem/usr/share/vqefiles/rkaudio_model_mode2_word2.rknn"
},
"aes": {
"status" : "disable",
"beta_up": "float:0.002",
"beta_down": "float:0.001"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "disable",
"attack_time" : "float:200.0",
"release_time" : "float:1000.0",
"attenuate_time" : "float:1000",
"max_gain" : "float:30.0",
"max_peak" : "float:-1.0",
"fRth0": "float:-95.0",
"fRth1" : "float:-90.0",
"fRth2" : "float:-35.0",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-35.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "disable",
"noiseFactor" : "float:0.88",
"swU" : 10,
"psiMin" : "float:0.02",
"psiMax" : "float:0.516",
"fGmin" : "float:0.1"
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "disable",
"rlsLg" :4,
"curveLg" : 30,
"delay" : 2,
"forgetting" : "float:0.98",
"t60" : "float:1.5",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain" : "float:10.0",
"fMpy" : "float:10.0",
"fSmoothAlpha" : "float:0.99",
"fSpeechGain" : "float:0.0"
},
"dtd": {
"status" : "disable",
"ksiThd_high" : "float:0.7",
"ksiThd_low" : "float:0.5"
}
}
}

@@ -1,79 +0,0 @@
{
"skv_configs": {
"aec": {
"status" : "enable",
"drop_ref_channel" : 0,
"delay_len" : 0,
"look_ahead" : 0
},
"bf": {
"status" : "enable",
"targ" : 4,
"drop_ref_channel" : 0
},
"fast_aec": {
"status" : "enable"
},
"aes": {
"status" : "enable",
"beta_up": "float:0.005",
"beta_down": "float:0.001"
},
"gsc": {
"status" : "disable",
"method" : 0
},
"agc": {
"status" : "enable",
"attack_time" : "float:200.0",
"release_time" : "float:200.0",
"attenuate_time" : "float:1000",
"max_gain" : "float:20.0",
"max_peak" : "float:-1.0",
"fRth0": "float:-45",
"fRth1" : "float:-40",
"fRth2" : "float:-35",
"fRk0" : "float:2.0",
"fRk1" : "float:0.8",
"fRk2" : "float:0.4",
"fLineGainDb" : "float:-25.0",
"swSmL0" : 40,
"swSmL1" : 80,
"swSmL2" : 80
},
"anr": {
"status" : "enable",
"noiseFactor" : "float:0.88",
"swU" : 10,
"psiMin" : "float:0.01",
"psiMax" : "float:0.516",
"fGmin" : "float:0.05"
},
"nlp": {
"status" : "disable",
"band_pass_thd" : "10,10,10,5,5,5,0,0",
"super_est_factor" : "6,10,10,10,10,10,6,6"
},
"dereverb": {
"status" : "disable",
"rlsLg" :4,
"curveLg" : 10,
"delay" : "2",
"forgetting" : "float:0.98",
"t60" : "float:0.5",
"coCoeff" : "float:1"
},
"cng": {
"status" : "disable",
"fGain" : "float:20.0",
"fMpy" : "float:20",
"fSmoothAlpha" : "float:0.99",
"fSpeechGain" : "float:0.0"
},
"dtd": {
"status" : "disable",
"ksiThd_high" : "float:0.6",
"ksiThd_low" : "float:0.3"
}
}
}

@@ -0,0 +1,181 @@
#ifdef _MSC_VER
#include "audio/wave_reader.h"
#include "audio/wave_writer.h"
#include "skv/rkaudio_sed.h"
#else
#include "wave_reader.h"
#include "wave_writer.h"
#include "rkaudio_sed.h"
#endif
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <string.h>
#include <assert.h>
#include <iostream>
#define IN_SIZE 256
double clk_sed;
static clock_t clk_start, clk_end;
int g_mem_cost;
typedef unsigned char SKV_BYTE;
int main(int argc, char* argv[])
{
int ret = 0;
double Total_time_s = 0.0, Total_time_e = 0.0;
double Total_sample = 0.0;
double Tmp_sample = 0.0;
clock_t startTime, endTime;
#if 1
argc = 3;
#endif
/* For Debug */
int aed_stat = 0;
int out_size = 0, in_size = 0, res = 0;
int bcd_stat = 0, bcd_stat_old = 0;
if (argc < 3)
{
printf("Error: Wrong input parameters! A example is as following: \n");
printf("fosafer_enh test_in.wav test_out.wav 0\n");
exit(-1);
}
char* in_filename = argv[1];
char* out_filename = argv[2];
//putenv("PATH_ORI_IN=/userdata/ori_in.wav");
//putenv("PATH_NET_IN=/userdata/net_in.wav");
//putenv("PATH_NET_OUT=/userdata/net_out.wav");
// for wave reader
wave_reader* wr;
wave_reader_error rerror;
// for wave writer
wave_writer* ww;
wave_writer_error werror;
wave_writer_format format;
// Open the input audio file
wr = wave_reader_open(in_filename, &rerror);
if (!wr) {
printf("rerror=%d\n", rerror);
return -1;
}
int mSampleRate = wave_reader_get_sample_rate(wr);
int mBitPerSample = wave_reader_get_sample_bits(wr);
int mNumChannel = wave_reader_get_num_channels(wr);
// Warn if the input has more than one channel
if (mNumChannel > 1) {
printf("This algorithm is a single channel algorithm and will run on the first channel of data\n");
}
int read_size = IN_SIZE * mNumChannel * mBitPerSample / 8;
SKV_BYTE* in = (SKV_BYTE*)malloc(read_size * sizeof(SKV_BYTE));
int out_res_num = 5;
short* out = (short*)malloc(IN_SIZE * out_res_num * sizeof(short));
format.num_channels = out_res_num;
format.sample_rate = mSampleRate;
format.sample_bits = mBitPerSample;
ww = wave_writer_open(out_filename, &format, &werror);
if (!ww)
{
printf("werror=%d\n", werror);
wave_reader_close(wr);
return -1;
}
RKAudioSedRes sed_res;
RKAudioSedParam* sed_param = rkaudio_sed_param_init();
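void* st_sed = rkaudio_sed_init(mSampleRate, mBitPerSample, mNumChannel, sed_param);
rkaudio_sed_param_destroy(sed_param);
if (st_sed == NULL) {
printf("Failed to create baby cry handle\n");
// Release everything acquired so far before bailing out
wave_writer_close(ww, &werror);
wave_reader_close(wr);
free(in);
free(out);
return -1;
}
// Query the init result only after the handle is known to be valid
char initres = rkaudio_sed_init_res(st_sed);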
startTime = clock();
int cnt = 0;
int gbs = 1, buz = 1, bcd = 1;
int bcdcounts = 0;
while (0 < (res = wave_reader_get_samples(wr, IN_SIZE, in )))
{
in_size = res * (mBitPerSample / 8) * mNumChannel;
cnt++;
clk_start = clock();
out_size = rkaudio_sed_process(st_sed, (short*)in, in_size / 2, &sed_res);
float lsd_res = rkaudio_sed_lsd_db(st_sed);
if (out_size < 0)
fprintf(stderr, "bcd process return error=%d\n", out_size);
clk_end = clock();
clk_sed += clk_end - clk_start;
//printf("lsd=%d,snr=%d,bcd=%d,buz_res=%d,gbs_res=%d\n", sed_res.lsd_res, sed_res.snr_res, sed_res.bcd_res, sed_res.buz_res, sed_res.gbs_res);
if (sed_res.bcd_res == 1)
bcdcounts++;
//printf of lmf
/*
if (gbs && sed_res.gbs_res)
{
printf("\nLOG(rkaudio_sed_process)\n");
printf("bcd=%d,buz_res=%d,gbs_res=%d\n", sed_res.bcd_res, sed_res.buz_res, sed_res.gbs_res);
printf("detect glass broken sound: %.2fs\n", Total_sample / mSampleRate);
gbs = 0;
}
if (buz && sed_res.buz_res)
{
printf("\nLOG(rkaudio_sed_process)\n");
printf("bcd=%d,buz_res=%d,gbs_res=%d\n", sed_res.bcd_res, sed_res.buz_res, sed_res.gbs_res);
printf("detect buzzer sound: %.2fs\n", Total_sample / mSampleRate);
buz = 0;
}
if (bcd && sed_res.bcd_res)
{
printf("\nLOG(rkaudio_sed_process)\n");
printf("bcd=%d,buz_res=%d,gbs_res=%d\n", sed_res.bcd_res, sed_res.buz_res, sed_res.gbs_res);
printf("detect baby crying: %.2fs\n", Total_sample / mSampleRate);
bcd = 0;
}
*/
// Write the detection results to the five output channels
for (int i = 0; i < IN_SIZE; i++) {
*(out + out_res_num * i) = 10000 * sed_res.snr_res;
*(out + out_res_num * i + 1) = 10000 * sed_res.lsd_res;
*(out + out_res_num * i + 2) = 10000 * sed_res.bcd_res;
*(out + out_res_num * i + 3) = 10000 * sed_res.buz_res;
*(out + out_res_num * i + 4) = 10000 * sed_res.gbs_res;
}
wave_writer_put_samples(ww, IN_SIZE, out);
Total_sample += in_size / 2 / mNumChannel;
//if (cnt % 63 == 0)
// printf("cnt = %d\n", cnt);
}
endTime = clock();
printf("\nFinished, speech_time = %f, cost_time = %f\n", \
Total_sample / mSampleRate, (double)(endTime - startTime) / CLOCKS_PER_SEC);
printf("sed = %f\n", clk_sed / CLOCKS_PER_SEC);
wave_writer_close(ww, &werror);
wave_reader_close(wr);
free(in);
free(out);
if (st_sed)
rkaudio_sed_destroy(st_sed);
return 0;
}
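The result-writing loop in the test program above packs the five detection results into one interleaved 5-channel frame per input sample. A minimal sketch of that packing in isolation, assuming a local stand-in struct (`SedFlags`) and a hypothetical helper name (`pack_sed_results`), neither of which exists in the SDK:

```c
#include <assert.h>
#include <stddef.h>

#define NUM_RESULTS 5 /* snr, lsd, bcd, buz, gbs: same order as the test program */

/* Hypothetical stand-in that mirrors the fields of RKAudioSedRes. */
typedef struct {
    int snr_res;
    int lsd_res;
    int bcd_res;
    int buz_res;
    int gbs_res;
} SedFlags;

/*
 * Write frame_len interleaved frames into out. Each frame holds the five
 * flags multiplied by scale, so a set flag shows up as a visible step in
 * the output WAV. out must hold at least frame_len * NUM_RESULTS shorts.
 */
static void pack_sed_results(const SedFlags *res, short *out,
                             size_t frame_len, short scale)
{
    for (size_t i = 0; i < frame_len; i++) {
        short *frame = out + NUM_RESULTS * i;
        frame[0] = (short)(scale * res->snr_res);
        frame[1] = (short)(scale * res->lsd_res);
        frame[2] = (short)(scale * res->bcd_res);
        frame[3] = (short)(scale * res->buz_res);
        frame[4] = (short)(scale * res->gbs_res);
    }
}
```

With `scale` set to 10000, as in the test program, a flag of 1 becomes a sample value of 10000, which stays within the 16-bit range.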


@@ -0,0 +1,12 @@
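Usage:
./test_sed <input audio file path> <output audio file path>
Example: ./test_sed ./dbg_capin.wav ./dbg_capin_out.wav
For library builds with outfile support, once the environment variables PATH_ORI_IN, PATH_NET_IN and PATH_NET_OUT are set,
ori_in.wav, net_in.wav and net_out.wav are generated at the corresponding locations for sample analysis.
Example: export PATH_ORI_IN=/USERDATA/ori_in.wav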
md5sum
814f25624fd064613b136b35d8db211a librkaudio_detect.so
99d6bce7d46fd2c6e2fe298f43ef4b02 librkaudio_detect.a


@@ -0,0 +1,32 @@
#ifndef WAVE_READER_H
#define WAVE_READER_H
#ifdef __cplusplus
extern "C" {
#endif
typedef enum {
WR_NO_ERROR = 0,
WR_OPEN_ERROR,
WR_IO_ERROR,
WR_ALLOC_ERROR,
WR_BAD_CONTENT,
} wave_reader_error;
typedef struct wave_reader wave_reader;
wave_reader *wave_reader_open(char *filename, wave_reader_error *error);
void wave_reader_close(wave_reader *wr);
int wave_reader_get_format(wave_reader *wr);
int wave_reader_get_num_channels(wave_reader *wr);
int wave_reader_get_sample_rate(wave_reader *wr);
int wave_reader_get_sample_bits(wave_reader *wr);
int wave_reader_get_num_samples(wave_reader *wr);
int wave_reader_get_samples(wave_reader *wr, int n, void *buf);
#ifdef __cplusplus
}
#endif
#endif//WAVE_READER_H
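`wave_reader_get_samples` fills a caller-supplied buffer, so the caller has to size it from the channel count and bit depth reported by the reader. A small sketch of that arithmetic, following the `read_size` computation in the test program; the helper names are hypothetical and not part of the wave reader API:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold `samples` interleaved PCM samples per channel. */
static size_t pcm_buffer_bytes(size_t samples, int num_channels, int sample_bits)
{
    return samples * (size_t)num_channels * (size_t)sample_bits / 8;
}

/* Number of 16-bit words contained in a buffer of `bytes` bytes. */
static size_t pcm_shorts_in(size_t bytes)
{
    return bytes / sizeof(short);
}
```

For the 256-sample frames used by the test program, mono 16-bit input needs 512 bytes, which the detector then receives as 256 shorts.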


@@ -0,0 +1,37 @@
#ifndef WAVE_WRITER_H
#define WAVE_WRITER_H
#ifdef __cplusplus
extern "C" {
#endif
typedef enum {
WW_NO_ERROR = 0,
WW_OPEN_ERROR,
WW_IO_ERROR,
WW_ALLOC_ERROR,
WW_BAD_FORMAT,
} wave_writer_error;
typedef struct {
int num_channels;
int sample_rate;
int sample_bits;
} wave_writer_format;
typedef struct wave_writer wave_writer;
struct wave_writer * wave_writer_open(char *filename, wave_writer_format *format, wave_writer_error *error);
int wave_writer_close(struct wave_writer *ww, wave_writer_error *error);
int wave_writer_get_format(wave_writer *ww);
int wave_writer_get_num_channels(wave_writer *ww);
int wave_writer_get_sample_rate(wave_writer *ww);
int wave_writer_get_sample_bits(wave_writer *ww);
int wave_writer_get_num_samples(wave_writer *ww);
int wave_writer_put_samples(wave_writer *ww, int n, void *buf);
#ifdef __cplusplus
}
#endif
#endif//WAVE_WRITER_H


@@ -0,0 +1,679 @@
#ifndef _RKAUDIO_PREPROCESS_H_
#define _RKAUDIO_PREPROCESS_H_
#include <stdio.h>
#include <stdlib.h>
#ifdef __cplusplus
extern "C" {
#endif
#define NUM_CHANNEL 8
#define NUM_REF_CHANNEL 1
#define NUM_DROP_CHANNEL 0
#define REF_POSITION 1
//static short Array[NUM_SRC_CHANNEL] = { 9,7,5,3,2,4,6,8 }
//static short Array[NUM_CHANNEL] = {2, 3, 0, 1}; //src first, ref second
//static short Array[NUM_CHANNEL] = { 0, 1, 3, 2 };// , 4, 5 };
//static short Array[NUM_CHANNEL] = { 10, 3, 11, 2, 4, 13, 5, 12, 0, 1, 6, 7, 8, 9, 14, 15};
static short Array[NUM_CHANNEL] = { 6, 7, 2, 3, 4, 5, 0, 1};
//static short Array[NUM_CHANNEL] = { 4, 13, 5, 12, 10, 3, 11, 2, 0, 1, 6, 7, 8, 9 ,14, 15};//8mic + 8ref,drop last 6ref
//Array[NUM_SRC_CHANNEL] = { 2,3,4,5,6,7};
/**********************EQ Parameter**********************/
static short EqPara_16k[5][13] =
{
// filter_bank_0
{-1 ,-2 ,-1 ,0 ,0 ,-3 ,-4 ,-1 ,0 ,-3 ,-6 ,-4 ,-1 },
// filter_bank_1
{-1 ,-4 ,-5 ,-6 ,-10 ,-9 ,-3 ,-3 ,-11 ,-12 ,2 ,12 ,0 },
// filter_bank_2
{-16 ,-11 ,2 ,-1 ,-11 ,-8 ,-4 ,-10 ,-6 ,13 ,7 ,-40 ,-62 },
// filter_bank_3
{-17 ,23 ,-22 ,-83 ,-57 ,6 ,-14 ,-92 ,-126 ,-141 ,-200 ,-197 ,-54 },
// filter_bank_4
{-8 ,-249 ,-390 ,10 ,428 ,-142 ,-1341 ,-1365 ,208 ,664 ,-2836 ,-8715 ,32764 },
};
/**********************AES Parameter**********************/
static float LimitRatio[2][3] = {
/* low freq mid freq high freq*/
{ 1.2f, 1.2f, 1.2f }, //limit
{ 1.2f, 1.2f, 1.2f }, //ratio
};
/**********************THD Parameter**********************/
static short ThdSplitFreq[4][2] = {
{ 500,1000},//For low frequency
{ 1000,2400},
{ 2400,4000},
{ 4000,8000},
};
static float ThdSupDegree[4][10] =
{
/* 2nd 3rd 4th 5th 6th 7th 8th 9th 10th 11th order */
/*
{0.005f, 0.005f, 0, 0, 0, 0, 0, 0, 0, 0},
{ 0.005f, 0.005f, 0.005f, 0, 0, 0, 0, 0, 0, 0},
{ 0.005f, 0.005f, 0.005f, 0.005f, 0, 0, 0,0, 0, 0,},
{ 0.003f, 0.003f, 0.004f, 0.005f, 0.003f, 0.003f, 0.003f, 0, 0, 0},
*/
/* 2nd 3rd 4th 5th 6th 7th 8th 9th 10th 11th order */
{ 0.005f, 0.005f, 0, 0, 0, 0, 0, 0, 0, 0},
{ 0.005f, 0.005f, 0.005f, 0, 0, 0, 0, 0, 0, 0},
{ 0.005f, 0.005f, 0.005f, 0.005f, 0, 0, 0,0, 0, 0,},
{ 0.003f, 0.003f, 0.002f, 0.002f, 0.002f, 0.002f, 0.002f, 0, 0, 0},
};
static short HardSplitFreq[5][2] = {
{ 500,2500}, //rows 1 to 4 select the hard-suppression frequency bins
{ 0,0},
{ 0,0},
{ 0,0},
{ 2000,4000},//frequency range used to calculate mean_G
};
static float HardThreshold[4] = { 0.35,0.15, 0.25, 0.15 };
/*************************************************/
/* Main enable flags used to control the AEC, BF and RX */
typedef enum RKAUDIOEnable_
{
RKAUDIO_EN_AEC = 1 << 0,
RKAUDIO_EN_BF = 1 << 1,
RKAUDIO_EN_RX = 1 << 2,
RKAUDIO_EN_CMD = 1 << 3,
} RKAUDIOEnable;
/* Sub-enable flags used to control the AEC, BF and RX */
typedef enum RKAecEnable_
{
EN_DELAY = 1 << 0,
EN_ARRAY_RESET = 1 << 1,
} RKAecEnable;
typedef enum RKPreprocessEnable_
{
EN_Fastaec = 1 << 0,
EN_Wakeup = 1 << 1,
EN_Dereverberation = 1 << 2,
EN_Nlp = 1 << 3,
EN_AES = 1 << 4,
EN_Agc = 1 << 5,
EN_Anr = 1 << 6,
EN_GSC = 1 << 7,
GSC_Method = 1 << 8,
EN_Fix = 1 << 9,
EN_STDT = 1 << 10,
EN_CNG = 1 << 11,
EN_EQ = 1 << 12,
EN_CHN_SELECT = 1 << 13,
EN_HOWLING = 1 << 14,
EN_DOA = 1 << 15,
EN_WIND = 1 << 16,
EN_AINR = 1 << 17,
} RKPreprocessEnable;
typedef enum RkaudioRxEnable_
{
EN_RX_Anr = 1 << 0,
EN_RX_HOWLING = 1 << 1,
EN_RX_AGC = 1 << 2,
} RkaudioRxEnable;
/*****************************************/
/* The three main parameter structs used to initialize the AEC, BF and RX */
typedef struct SKVAECParameter_ {
int pos;
int drop_ref_channel;
int model_aec_en;
int delay_len;
int look_ahead;
short * Array_list;
//mdf
short filter_len;
//delay
void* delay_para;
} SKVAECParameter;
typedef struct SKVPreprocessParam_
{
/* Parameters of the preprocessing (BF) stage */
int model_bf_en;
int ref_pos;
int Targ;
int num_ref_channel;
int drop_ref_channel;
void* dereverb_para;
void* aes_para;
void* nlp_para;
void* anr_para;
void* agc_para;
void* cng_para;
void* dtd_para;
void* eq_para;
void* howl_para;
void* doa_para;
}SKVPreprocessParam;
typedef struct RkaudioRxParam_
{
/* Parameters of the RX stage */
int model_rx_en;
void* anr_para;
void* howl_para;
void* agc_para;
}RkaudioRxParam;
/****************************************/
/* Parameter structs for the sub-modules of the AEC, BF and RX */
typedef struct RKAudioDelayParam_ {
short MaxFrame;
short LeastDelay;
short JumpFrame;
short DelayOffset;
short MicAmpThr;
short RefAmpThr;
short StartFreq;
short EndFreq;
float SmoothFactor;
}RKAudioDelayParam;
typedef struct SKVANRParam_ {
float noiseFactor;
int swU;
float PsiMin;
float PsiMax;
float fGmin;
short Sup_Freq1;
short Sup_Freq2;
float Sup_Energy1;
float Sup_Energy2;
short InterV;
float BiasMin;
short UpdateFrm;
float NPreGammaThr;
float NPreZetaThr;
float SabsGammaThr0;
float SabsGammaThr1;
float InfSmooth;
float ProbSmooth;
float CompCoeff;
float PrioriMin;
float PostMax;
float PrioriRatio;
float PrioriRatioLow;
int SplitBand;
float PrioriSmooth;
//transient
short TranMode;
} SKVANRParam;
typedef struct RKAudioDereverbParam_
{
int rlsLg;
int curveLg;
int delay;
float forgetting;
float T60;
float coCoeff;
} RKAudioDereverbParam;
typedef struct RKAudioAESParameter_ {
float Beta_Up;
float Beta_Down;
float Beta_Up_Low;
float Beta_Down_Low;
short low_freq;
short high_freq;
short THD_Flag;
short HARD_Flag;
float LimitRatio[2][3];
short ThdSplitFreq[4][2];
float ThdSupDegree[4][10];
short HardSplitFreq[5][2];
float HardThreshold[4];
} RKAudioAESParameter;
typedef struct RKDTDParam_
{
float ksiThd_high; /* Single-talk/double-talk decision threshold */
float ksiThd_low; /* Single-talk/double-talk decision threshold */
}RKDTDParam;
typedef struct SKVNLPParameter_ {
short int g_ashwAecBandNlpPara_16k[8][2];
} SKVNLPParameter;
typedef struct RKAGCParam_ {
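/* New-version AGC parameters */
float attack_time; /* Attack time: time required for the AGC gain to fall */
float release_time; /* Release time: time required for the AGC gain to rise */
float max_gain; /* Maximum gain, also the linear-segment gain, in dB */
float max_peak; /* Maximum energy of the output speech after AGC processing, in dB */
float fRth0; /* Energy threshold (dB) where the expansion segment ends and the linear segment begins */
float fRk0; /* Expansion-segment slope */
float fRth1; /* Energy threshold (dB) where the compression segment starts and the linear segment ends */
/* Parameters with no effect */
int fs; /* Data sample rate */
int frmlen; /* Processing frame length */
float attenuate_time; /* Noise attenuation time: time required for the gain in noise segments to decay to 1 */
float fRth2; /* Energy threshold (dB) where the compression segment starts */
float fRk1; /* Expansion-segment slope */
float fRk2; /* Expansion-segment slope */
float fLineGainDb; /* Linear-segment boost, in dB */
int swSmL0; /* Number of time-domain smoothing points for the expansion segment */
int swSmL1; /* Number of time-domain smoothing points for the linear segment */
int swSmL2; /* Number of time-domain smoothing points for the compression segment */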
} RKAGCParam;
typedef struct RKCNGParam_
{
/*CNG Parameter*/
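float fGain; /* INT16 Q0: amplitude ratio of the applied comfort noise */
float fMpy; /* INT16 Q0: amplitude of the generated white-noise random numbers */
float fSmoothAlpha; /* Comfort-noise smoothing coefficient */
float fSpeechGain; /* Additional comfort-noise gain ratio applied according to speech energy */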
} RKCNGParam;
typedef struct RKaudioEqParam_ {
int shwParaLen; // Number of filter coefficients
short pfCoeff[5][13]; // Filter coefficients
} RKaudioEqParam;
typedef struct RKHOWLParam_
{
short howlMode;
}RKHOWLParam;
typedef struct RKDOAParam_
{
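float rad;// Spacing between the two mics for a linear array, or the radius for a circular array; the array type cannot be set here and depends on the library build (e.g. a circular-array library supports only circular-array localization)
short start_freq;
short end_freq;
short lg_num; // this value should be even
short lg_pitch_num; //only used for circle array, linear array must be 1; pitch-angle scan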
}RKDOAParam;
/*************** TX ***************/
/* Set the sub-parameters used to initialize the DELAY */
inline static void* rkaudio_delay_param_init() {
/*RKAudioDelayParam* param = (RKAudioDelayParam*)malloc(sizeof(RKAudioDelayParam));*/
RKAudioDelayParam* param = (RKAudioDelayParam*)calloc(1, sizeof(RKAudioDelayParam));
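param->MaxFrame = 32; /* Maximum number of frames for delay estimation */
param->LeastDelay = 0; /* Minimum number of frames for delay estimation */
param->JumpFrame = 12; /* Number of frames to skip */
param->DelayOffset = 1; /* Delay offset, in frames */
param->MicAmpThr = 50; /* Minimum energy threshold on the mic side */
param->RefAmpThr = 50; /* Minimum energy threshold on the ref side */
param->StartFreq = 1000; /* Start frequency of the band used for delay estimation */
param->EndFreq = 4000; /* End frequency of the band used for delay estimation */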
param->SmoothFactor = 0.99f;
return (void*)param;
}
/* Set the sub-parameters used to initialize the ANR */
inline static void* rkaudio_anr_param_init_tx() {
/*SKVANRParam* param = (SKVANRParam*)malloc(sizeof(SKVANRParam));*/
SKVANRParam* param = (SKVANRParam*)calloc(1, sizeof(SKVANRParam));
/* anr parameters */
param->noiseFactor = 0.88f;//-3588.0f for compatibility with old JSON
//param->noiseFactor = -3588.0f;
param->swU = 10;
param->PsiMin = 0.02;
param->PsiMax = 0.516;
param->fGmin = 0.05;
param->Sup_Freq1 = -3588;
param->Sup_Freq2 = -3588;
param->Sup_Energy1 = 10000;
param->Sup_Energy2 = 10000;
param->InterV = 8; //ANR_NOISE_EST_V
param->BiasMin = 1.67f; //ANR_NOISE_EST_BMIN
param->UpdateFrm = 15; //UPDATE_FRAME
param->NPreGammaThr = 4.6f; //ANR_NOISE_EST_GAMMA0
param->NPreZetaThr = 1.67f; //ANR_NOISE_EST_PSI0
param->SabsGammaThr0 = 1.0f; //ANR_NOISE_EST_GAMMA2
param->SabsGammaThr1 = 3.0f; //ANR_NOISE_EST_GAMMA1
param->InfSmooth = 0.8f; //ANR_NOISE_EST_ALPHA_S
param->ProbSmooth = 0.7f; //ANR_NOISE_EST_ALPHA_D
param->CompCoeff = 1.4f; //ANR_NOISE_EST_BETA
param->PrioriMin = 0.0316f; //ANR_NOISE_EST_ESP_MIN
param->PostMax = 40.0f; //ANR_NOISE_EST_GAMMA_MAX
param->PrioriRatio = 0.95f; //ANR_NOISE_EST_ALPHA
param->PrioriRatioLow = 0.95f; //ANR_NOISE_EST_ALPHA
param->SplitBand = 20;
param->PrioriSmooth = 0.7f; //ANR_ENHANCE_BETA
//transient
param->TranMode = 0;
return (void*)param;
}
/* Set the sub-parameters used to initialize the Dereverb */
inline static void* rkaudio_dereverb_param_init() {
/*RKAudioDereverbParam* param = (RKAudioDereverbParam*)malloc(sizeof(RKAudioDereverbParam));*/
RKAudioDereverbParam* param = (RKAudioDereverbParam*)calloc(1, sizeof(RKAudioDereverbParam));
param->rlsLg = 4; /* RLS filter order */
param->curveLg = 30; /* Distribution-curve order */
param->delay = 2; /* RLS filter delay */
param->forgetting = 0.98; /* RLS filter forgetting factor */
param->T60 = 0.3;//1.5; /* Estimated reverberation time (in s); larger values give stronger dereverberation but are more prone to over-suppression */
param->coCoeff = 1; /* Cross-coherence adjustment coefficient that guards against over-suppression; larger values are stronger; recommended range: 0.5 to 2 */
return (void*)param;
}
/* Set the sub-parameters used to initialize the AES */
inline static void* rkaudio_aes_param_init() {
/*RKAudioAESParameter* param = (RKAudioAESParameter*)malloc(sizeof(RKAudioAESParameter));*/
RKAudioAESParameter* param = (RKAudioAESParameter*)calloc(1, sizeof(RKAudioAESParameter));
//param->Beta_Up = 0.002f; /* Rise speed; -3588.0f for compatibility with old JSON */
param->Beta_Up = 0.002f;
param->Beta_Down = 0.001f; /* Fall speed */
param->Beta_Up_Low = 0.002f; /* Low-frequency rise speed */
param->Beta_Down_Low = 0.001f; /* Low-frequency fall speed */
param->low_freq = 500;
param->high_freq = 3750;
param->THD_Flag = 1; /* 1 open THD, 0 close THD */
param->HARD_Flag = 1; /* 1 open Hard Suppress, 0 close Hard Suppress */
int i, j;
for (i = 0; i < 2; i++)
for (j = 0; j < 3; j++)
param->LimitRatio[i][j] = LimitRatio[i][j];
for (i = 0; i < 4; i++)
for (j = 0; j < 2; j++)
param->ThdSplitFreq[i][j] = ThdSplitFreq[i][j];
for (i = 0; i < 4; i++)
for (j = 0; j < 10; j++)
param->ThdSupDegree[i][j] = ThdSupDegree[i][j];
for (i = 0; i < 5; i++)
for (j = 0; j < 2; j++)
param->HardSplitFreq[i][j] = HardSplitFreq[i][j];
for (i = 0; i < 4; i++)
param->HardThreshold[i] = HardThreshold[i];
return (void*)param;
}
/* Set the sub-parameters used to initialize the DTD */
inline static void* rkaudio_dtd_param_init()
{
/*RKDTDParam* param = (RKDTDParam*)malloc(sizeof(RKDTDParam));*/
RKDTDParam* param = (RKDTDParam*)calloc(1, sizeof(RKDTDParam));
/* dtd parameters */
param->ksiThd_high = 0.60f; /* Single-talk/double-talk decision threshold */
param->ksiThd_low = 0.50f;
return (void*)param;
}
/* Set the sub-parameters used to initialize the AGC */
inline static void* rkaudio_agc_param_init()
{
/*RKAGCParam* param = (RKAGCParam*)malloc(sizeof(RKAGCParam));*/
RKAGCParam* param = (RKAGCParam*)calloc(1, sizeof(RKAGCParam));
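param->attack_time = 100.0; /* Attack time: time required for the AGC gain to rise */
param->release_time = 200.0; /* Release time: time required for the AGC gain to fall */
//param->max_gain = 35.0; /* Maximum gain, also the linear-segment gain, in dB */
param->max_gain = 30; /* Maximum gain, also the linear-segment gain, in dB */
param->max_peak = -3.0; /* Maximum energy of the output speech after AGC processing, in dB */
param->fRk0 = 2; /* Expansion-segment slope */
param->fRth2 = -45; /* Energy threshold (dB) where the compression segment starts and the linear segment ends; the gain decreases gradually above it. Note: fRth2 + max_gain < max_peak */
param->fRth1 = -60; /* Energy threshold (dB) where the expansion segment ends and the linear segment begins; energy above it receives max_gain */
param->fRth0 = -65; /* Noise-gate threshold */
/* Parameters with no effect */
param->fs = 16000; /* Data sample rate */
param->frmlen = 256; /* Processing frame length */
param->attenuate_time = 1000; /* Noise attenuation time: time required for the gain in noise segments to decay to 1 */
param->fRk1 = 0.8; /* Expansion-segment slope */
param->fRk2 = 0.4; /* Expansion-segment slope */
param->fLineGainDb = -25.0f; /* Below this value, no gain is applied during the initial attenuate_time (ms) */
param->swSmL0 = 1; /* used for AINR pre_gain */
param->swSmL1 = 80; /* Number of time-domain smoothing points for the linear segment */
param->swSmL2 = 80; /* Number of time-domain smoothing points for the compression segment */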
return (void*)param;
}
/* Set the sub-parameters used to initialize the CNG */
inline static void* rkaudio_cng_param_init()
{
/*RKCNGParam* param = (RKCNGParam*)malloc(sizeof(RKCNGParam));*/
RKCNGParam* param = (RKCNGParam*)calloc(1, sizeof(RKCNGParam));
/* cng parameters */
param->fSmoothAlpha = 0.99f; /* INT16 Q15: smoothness of the applied comfort noise */
param->fSpeechGain = 0; /* INT16 Q15: degree to which the applied comfort noise imitates speech texture */
param->fGain = 10.0; /* INT16 Q0: amplitude ratio of the applied comfort noise */
param->fMpy = 10; /* INT16 Q0: amplitude of the generated white-noise random numbers */
return (void*)param;
}
/* Set the sub-parameters used to initialize the EQ */
inline static void* rkaudio_eq_param_init() {
/*RKaudioEqParam* param = (RKaudioEqParam*)malloc(sizeof(RKaudioEqParam));*/
RKaudioEqParam* param = (RKaudioEqParam*)calloc(1, sizeof(RKaudioEqParam));
param->shwParaLen = 65;
int i, j;
for (i = 0; i < 5; i++) {
for (j = 0; j < 13; j++) {
param->pfCoeff[i][j] = EqPara_16k[i][j];
}
}
return (void*)param;
}
/* Set the sub-parameters used to initialize the HOWL */
inline static void* rkaudio_howl_param_init_tx() {
/*RKHOWLParam* param = (RKHOWLParam*)malloc(sizeof(RKHOWLParam));*/
RKHOWLParam* param = (RKHOWLParam*)calloc(1, sizeof(RKHOWLParam));
param->howlMode = 5;
return (void*)param;
}
inline static void* rkaudio_doa_param_init() {
/*RKDOAParam* param = (RKDOAParam*)malloc(sizeof(RKDOAParam));*/
RKDOAParam* param = (RKDOAParam*)calloc(1, sizeof(RKDOAParam));
param->rad = 0.04f;
param->start_freq = 1000;
param->end_freq = 4000;
param->lg_num = 40;
param->lg_pitch_num = 1;
return (void*)param;
}
/************* RX *************/
inline static void* rkaudio_anr_param_init_rx() {
/*SKVANRParam* param = (SKVANRParam*)malloc(sizeof(SKVANRParam));*/
SKVANRParam* param = (SKVANRParam*)calloc(1, sizeof(SKVANRParam));
/* anr parameters */
param->noiseFactor = 0.88f;
param->swU = 10;
param->PsiMin = 0.02;
param->PsiMax = 0.516;
param->fGmin = 0.05;
param->Sup_Freq1 = -3588;
param->Sup_Freq2 = -3588;
param->Sup_Energy1 = 100000;
param->Sup_Energy2 = 100000;
param->InterV = 8; //ANR_NOISE_EST_V
param->BiasMin = 1.67f; //ANR_NOISE_EST_BMIN
param->UpdateFrm = 15; //UPDATE_FRAME
param->NPreGammaThr = 4.6f; //ANR_NOISE_EST_GAMMA0
param->NPreZetaThr = 1.67f; //ANR_NOISE_EST_PSI0
param->SabsGammaThr0 = 1.0f; //ANR_NOISE_EST_GAMMA2
param->SabsGammaThr1 = 3.0f; //ANR_NOISE_EST_GAMMA1
param->InfSmooth = 0.8f; //ANR_NOISE_EST_ALPHA_S
param->ProbSmooth = 0.7f; //ANR_NOISE_EST_ALPHA_D
param->CompCoeff = 1.4f; //ANR_NOISE_EST_BETA
param->PrioriMin = 0.0316f; //ANR_NOISE_EST_ESP_MIN
param->PostMax = 40.0f; //ANR_NOISE_EST_GAMMA_MAX
param->PrioriRatio = 0.95f; //ANR_NOISE_EST_ALPHA
param->PrioriRatioLow = 0.95f; //ANR_NOISE_EST_ALPHA
param->SplitBand = 20;
param->PrioriSmooth = 0.7f; //ANR_ENHANCE_BETA
//transient
param->TranMode = 0;
return (void*)param;
}
inline static void* rkaudio_howl_param_init_rx() {
/*RKHOWLParam* param = (RKHOWLParam*)malloc(sizeof(RKHOWLParam));*/
RKHOWLParam* param = (RKHOWLParam*)calloc(1, sizeof(RKHOWLParam));
param->howlMode = 4;
return (void*)param;
}
inline static void* rkaudio_agc_param_init_rx()
{
RKAGCParam* param = (RKAGCParam*)malloc(sizeof(RKAGCParam));
/* New-version AGC parameters */
param->attack_time = 200.0; /* Attack time: time required for the AGC gain to rise */
param->release_time = 200.0; /* Release time: time required for the AGC gain to fall */
//param->max_gain = 35.0; /* Maximum gain, also the linear-segment gain, in dB */
param->max_gain = 5.0; /* Maximum gain, also the linear-segment gain, in dB */
param->max_peak = -1; /* Maximum energy of the output speech after AGC processing, in dB */
param->fRk0 = 2; /* Expansion-segment slope */
param->fRth2 = -25; /* Energy threshold (dB) where the compression segment starts and the linear segment ends; the gain decreases gradually above it. Note: fRth2 + max_gain < max_peak */
param->fRth1 = -35; /* Energy threshold (dB) where the expansion segment ends and the linear segment begins; energy above it receives max_gain */
param->fRth0 = -45; /* Noise-gate threshold */
/* Parameters with no effect */
param->fs = 16000; /* Data sample rate */
param->frmlen = 256; /* Processing frame length */
param->attenuate_time = 1000; /* Noise attenuation time: time required for the gain in noise segments to decay to 1 */
param->fRk1 = 0.8; /* Expansion-segment slope */
param->fRk2 = 0.4; /* Expansion-segment slope */
param->fLineGainDb = -25.0f; /* Below this value, no gain is applied during the initial attenuate_time (ms) */
param->swSmL0 = 40; /* Number of time-domain smoothing points for the expansion segment */
param->swSmL1 = 80; /* Number of time-domain smoothing points for the linear segment */
param->swSmL2 = 80; /* Number of time-domain smoothing points for the compression segment */
return (void*)param;
}
/* Set the sub-parameters used to initialize the AEC */
inline static void* rkaudio_aec_param_init()
{
/*SKVAECParameter* param = (SKVAECParameter*)malloc(sizeof(SKVAECParameter));*/
SKVAECParameter* param = (SKVAECParameter*)calloc(1, sizeof(SKVAECParameter));
param->pos = REF_POSITION;
param->drop_ref_channel = NUM_DROP_CHANNEL;
param->model_aec_en = 0; //param->model_aec_en = EN_DELAY;
param->delay_len = 0; //-3588 for compatibility with old JSON
//param->delay_len = -3588;
param->look_ahead = 0;
param->Array_list = Array;
//mdf
param->filter_len = 2;
//delay
param->delay_para = rkaudio_delay_param_init();
return (void*)param;
}
/* Set the sub-parameters used to initialize the BF */
inline static void* rkaudio_preprocess_param_init()
{
/*SKVPreprocessParam* param = (SKVPreprocessParam*)malloc(sizeof(SKVPreprocessParam));*/
SKVPreprocessParam* param = (SKVPreprocessParam*)calloc(1, sizeof(SKVPreprocessParam));
//param->model_bf_en = EN_Fastaec;
//param->model_bf_en = EN_AINR | EN_Anr | EN_Agc; // | EN_Wakeup | EN_WIND | EN_Dereverberation | EN_HOWLING;
//param->model_bf_en = EN_Fastaec | EN_AES | EN_Dereverberation | EN_Agc;
//param->model_bf_en = EN_Fix | EN_Agc | EN_AINR | EN_Dereverberation;
param->model_bf_en = EN_DOA | EN_Fix | EN_AINR | EN_Anr ;
//param->model_bf_en = EN_Wakeup;
//param->model_bf_en = EN_Fastaec | EN_Fix | EN_Agc | EN_Anr;
//param->model_bf_en = EN_Fastaec | EN_AES | EN_Agc | EN_Fix | EN_Anr | EN_HOWLING;
//param->model_bf_en = EN_Fastaec | EN_AES | EN_Anr | EN_Agc;
param->Targ = 2;
param->ref_pos = REF_POSITION;
param->num_ref_channel = NUM_REF_CHANNEL;
param->drop_ref_channel = NUM_DROP_CHANNEL;
param->anr_para = rkaudio_anr_param_init_tx();
param->dereverb_para = rkaudio_dereverb_param_init();
param->aes_para = rkaudio_aes_param_init();
param->dtd_para = rkaudio_dtd_param_init();
param->agc_para = rkaudio_agc_param_init();
param->cng_para = rkaudio_cng_param_init();
param->eq_para = rkaudio_eq_param_init();
param->howl_para = rkaudio_howl_param_init_tx();
param->doa_para = rkaudio_doa_param_init();
return (void*)param;
}
/* Set the sub-parameters used to initialize the RX */
inline static void* rkaudio_rx_param_init()
{
//RkaudioRxParam* param = (RkaudioRxParam*)malloc(sizeof(RkaudioRxParam));
RkaudioRxParam* param = (RkaudioRxParam*)calloc(1, sizeof(RkaudioRxParam));
param->model_rx_en = EN_RX_AGC | EN_RX_Anr | EN_RX_HOWLING;
param->anr_para = rkaudio_anr_param_init_rx();
param->howl_para = rkaudio_howl_param_init_rx();
param->agc_para = rkaudio_agc_param_init_rx();
return (void*)param;
}
typedef struct RKAUDIOParam_
{
int model_en;
void* aec_param;
void* bf_param;
void* rx_param;
int read_size;
} RKAUDIOParam;
inline static void rkaudio_aec_param_destory(void* param_)
{
SKVAECParameter* param = (SKVAECParameter*)param_;
free(param->delay_para); param->delay_para = NULL;
free(param); param = NULL;
}
inline static void rkaudio_preprocess_param_destory(void* param_)
{
SKVPreprocessParam* param = (SKVPreprocessParam*)param_;
free(param->dereverb_para); param->dereverb_para = NULL;
free(param->aes_para); param->aes_para = NULL;
free(param->anr_para); param->anr_para = NULL;
free(param->agc_para); param->agc_para = NULL;
param->nlp_para = NULL;
free(param->cng_para); param->cng_para = NULL;
free(param->dtd_para); param->dtd_para = NULL;
free(param->eq_para); param->eq_para = NULL;
free(param->howl_para); param->howl_para = NULL;
free(param->doa_para); param->doa_para = NULL;
free(param); param = NULL;
}
inline static void rkaudio_rx_param_destory(void* param_)
{
RkaudioRxParam* param = (RkaudioRxParam*)param_;
free(param->anr_para); param->anr_para = NULL;
free(param->howl_para); param->howl_para = NULL;
free(param->agc_para); param->agc_para = NULL; /* allocated by rkaudio_agc_param_init_rx() */
free(param); param = NULL;
}
inline static void rkaudio_param_deinit(void* param_)
{
RKAUDIOParam* param = (RKAUDIOParam*)param_;
if (param->aec_param != NULL)
rkaudio_aec_param_destory(param->aec_param);
if (param->bf_param != NULL)
rkaudio_preprocess_param_destory(param->bf_param);
if (param->rx_param != NULL)
rkaudio_rx_param_destory(param->rx_param);
}
void* rkaudio_preprocess_init(int rate, int bits, int src_chan, int ref_chan, RKAUDIOParam* param);
void rkaudio_param_printf(int src_chan, int ref_chan, RKAUDIOParam* param);
int rkaudio_Doa_invoke(void* st_ptr);
int rkaudio_Cir_Doa_invoke(void* st_ptr, int* ang_doa, int* pth_doa);
int rkaudio_preprocess_get_cmd_id(void* st_ptr, float* cmd_score, int* cmd_id);
int rkaudio_preprocess_get_asr_id(void* st_ptr, float* asr_score, int* asr_id);
int rkaudio_param_set(void* st_ptr, int rkaudio_enable, int rkaec_enable, int rkbf_enable);
void rkaudio_preprocess_destory(void* st_ptr);
int rkaudio_preprocess_short(void* st_ptr, short* in, short* out, int in_size, int* wakeup_status);
int rkaudio_rx_short(void* st_ptr, short* in, short* out);
int rkaudio_mdf_dump(void* st_ptr, short* out);
void rkaudio_asr_set_param(float min, float max, float keep);
int rkaudio_rknn_path_set(char* asr_rknn_path_, char* kws_rknn_path_, char* dns_rknn_path_);
void rkaudio_param_deinit(void* param_);
#ifdef __cplusplus
}
#endif
#endif // _RKAUDIO_PREPROCESS_H_
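The comments in `rkaudio_agc_param_init` note that `fRth2 + max_gain < max_peak` must hold. A minimal sketch of a sanity check for that constraint, using a local stand-in struct (`AgcLevels`) that carries only the relevant fields. The threshold-ordering check is an assumption inferred from the default values, not something the header states, and `agc_levels_ok` is a hypothetical helper:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of RKAGCParam; all values are in dB. */
typedef struct {
    float max_gain;
    float max_peak;
    float fRth0; /* noise-gate threshold */
    float fRth1; /* expansion end / linear start */
    float fRth2; /* linear end / compression start */
} AgcLevels;

static bool agc_levels_ok(const AgcLevels *p)
{
    /* Constraint documented in the header comment. */
    if (!(p->fRth2 + p->max_gain < p->max_peak))
        return false;
    /* Assumed ordering: noise gate <= expansion end <= compression start. */
    return p->fRth0 <= p->fRth1 && p->fRth1 <= p->fRth2;
}
```

Both default parameter sets in this header pass the check: the TX defaults give -45 + 30 = -15 dB against a peak of -3 dB, and the RX defaults give -25 + 5 = -20 dB against a peak of -1 dB.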


@@ -0,0 +1,216 @@
/* Copyright (C) RK
Written by Ryne
Date : 20221214 v1.2.1*/
#ifndef RKAUDIO_SED_H
#define RKAUDIO_SED_H
#include <stdlib.h>
#ifdef __cplusplus
extern "C" {
#endif
#define JUMP_FRAME 20
#define FIR_EXAMPLE_LENC 4
static float FIR_EXAMPLE_COEFFS[FIR_EXAMPLE_LENC] = { -0.429928160597777, - 0.816685462483343, 0.816685462483343, 0.429928160597777 };
typedef struct RKSEDAGCParam_
{
/* New-version AGC parameters */
float attack_time; /* Attack time: time required for the AGC gain to fall */
float release_time; /* Release time: time required for the AGC gain to rise */
float max_gain; /* Maximum gain, also the linear-segment gain, in dB */
float max_peak; /* Maximum energy of the output speech after AGC processing, in dB */
float fRth0; /* Energy threshold (dB) where the expansion segment ends and the linear segment begins */
float fRk0; /* Expansion-segment slope */
float fRth1; /* Energy threshold (dB) where the compression segment starts and the linear segment ends */
/* Parameters with no effect */
int fs; /* Data sample rate */
int frmlen; /* Processing frame length */
float attenuate_time; /* Noise attenuation time: time required for the gain in noise segments to decay to 1 */
float fRth2; /* Energy threshold (dB) where the compression segment starts */
float fRk1; /* Expansion-segment slope */
float fRk2; /* Expansion-segment slope */
float fLineGainDb; /* Linear-segment boost, in dB */
int swSmL0; /* Number of time-domain smoothing points for the expansion segment */
int swSmL1; /* Number of time-domain smoothing points for the linear segment */
int swSmL2; /* Number of time-domain smoothing points for the compression segment */
} RKSEDAGCParam;
inline static RKSEDAGCParam* rkaudio_sedagc_param_init()
{
RKSEDAGCParam* param = (RKSEDAGCParam*)malloc(sizeof(RKSEDAGCParam));
param->attack_time = 200.0;
param->release_time = 400.0;
param->max_gain = 30.0;
param->max_peak = -3.0;
param->fRk0 = 2;
param->fRth2 = -40;
param->fRth1 = -45;
param->fRth0 = -70;
param->fs = 16000;
param->frmlen = 256;
param->attenuate_time = 1000;
param->fRk1 = 0.8;
param->fRk2 = 0.4;
param->fLineGainDb = -25.0f;
param->swSmL0 = 40;
param->swSmL1 = 80;
param->swSmL2 = 80;
return param;
}
typedef struct RKFIRParam_ {
int fir_len;
float* fir_coeffs;
} RKFIRParam;
inline static RKFIRParam* rkaudio_fir_param_init()
{
RKFIRParam* param = (RKFIRParam*)malloc(sizeof(RKFIRParam));
param->fir_len = FIR_EXAMPLE_LENC;
param->fir_coeffs = FIR_EXAMPLE_COEFFS;
return param;
}
typedef struct SedAedParam_
{
float snr_db;
float lsd_db;
int policy;
float smooth_param;
} SedAedParam;
typedef struct SedParam_
{
int frm_len;
int nclass;
int babycry_decision_len;
int buzzer_decision_len;
int glassbreaking_decision_len;
float babycry_confirm_prob;
float buzzer_confirm_prob;
float glassbreaking_confirm_prob;
} SedParam;
typedef struct RKAudioSedRes_
{
int snr_res;
int lsd_res;
int bcd_res;
int buz_res;
int gbs_res;
} RKAudioSedRes;
typedef enum RKAudioSedType_ {
SED_TYPE_BCD = 1 << 0,
SED_TYPE_BUZ = 1 << 1,
SED_TYPE_GBS = 1 << 2,
} RKAudioSedType;
typedef enum RKAudioSedEnable_
{
EN_AGC = 1 << 0,
EN_AED = 1 << 1,
EN_SED = 1 << 2,
EN_FIR = 1 << 3,
} RKAudioSedEnable;
typedef struct RKAudioSedParam_
{
int model_en;
RKSEDAGCParam* agc_param;
SedAedParam *aed_param;
SedParam* sed_param;
RKFIRParam* fir_param;
} RKAudioSedParam;
static SedAedParam *rkaudio_sed_param_aed()
{
SedAedParam* param = (SedAedParam *)calloc(1, sizeof(SedAedParam));
param->snr_db = 10;
param->lsd_db = -35;
param->policy = 1;
param->smooth_param = 0.9;
return param;
}
static SedParam* rkaudio_sed_param()
{
SedParam* param = (SedParam*)malloc(sizeof(SedParam));
param->frm_len = 90;
param->nclass = 1;
param->babycry_decision_len = 60;
param->buzzer_decision_len = 100;
param->glassbreaking_decision_len = 30;
param->babycry_confirm_prob = 0.85;
param->buzzer_confirm_prob = 0.5;
param->glassbreaking_confirm_prob = 0.98;
return param;
}
inline static RKAudioSedParam *rkaudio_sed_param_init()
{
RKAudioSedParam *param = (RKAudioSedParam *)calloc(1, sizeof(RKAudioSedParam));
param->model_en = EN_AGC | EN_AED | EN_SED | EN_FIR;
param->agc_param = rkaudio_sedagc_param_init();
param->aed_param = rkaudio_sed_param_aed();
param->sed_param = rkaudio_sed_param();
param->fir_param = rkaudio_fir_param_init();
return param;
}
inline static void rkaudio_sed_param_destroy(RKAudioSedParam *param)
{
if (param == NULL)
return;
if (param->agc_param)
{
free(param->agc_param);
param->agc_param = NULL;
}
if (param->aed_param)
{
free(param->aed_param);
param->aed_param = NULL;
}
if (param->sed_param)
{
free(param->sed_param);
param->sed_param = NULL;
}
if (param->fir_param)
{
free(param->fir_param);
param->fir_param = NULL;
}
free(param);
}
void *rkaudio_sed_init(int fs, int bit, int chan, RKAudioSedParam *param);
char rkaudio_sed_init_res(void* st_);
void rkaudio_sed_destroy(void *st_);
int rkaudio_sed_process(void *st_, short *in, int in_size, RKAudioSedRes *res);
float rkaudio_sed_lsd_db(void *st_);
int rkaudio_sed_param_set(void* st_, void* param, int type);
int rkaudio_sed_bcd_model_set(char* sed_rknn_path);
#ifdef __cplusplus
}
#endif
/** @}*/
#endif
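`model_en` in `RKAudioSedParam` is a bitmask built from `RKAudioSedEnable`. A short sketch of composing and querying such a mask; the enum values are copied from the header so that the block compiles on its own, and the two helper names are hypothetical:

```c
#include <assert.h>

/* Copied from RKAudioSedEnable so that this sketch is self-contained. */
enum {
    EN_AGC = 1 << 0,
    EN_AED = 1 << 1,
    EN_SED = 1 << 2,
    EN_FIR = 1 << 3
};

/* Returns 1 if the given stage bit is set in the mask, 0 otherwise. */
static int sed_stage_enabled(int model_en, int stage)
{
    return (model_en & stage) != 0;
}

/* Returns a copy of the mask with the given stage bit cleared. */
static int sed_stage_disable(int model_en, int stage)
{
    return model_en & ~stage;
}
```

Starting from the default mask `EN_AGC | EN_AED | EN_SED | EN_FIR`, clearing `EN_FIR` before calling `rkaudio_sed_init` would be one way to skip the FIR stage, assuming the library honors the mask as its name suggests.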

Some files were not shown because too many files have changed in this diff