Augmenter that applies a pitch adjustment operation to audio.

class PitchAug(sampling_rate, zone=(0.2, 0.8), coverage=1.0, duration=None, factor=(-10, 10), name='Pitch_Aug', verbose=0, stateless=True)


  • sampling_rate (int) – Sampling rate of input audio.
  • zone (tuple) – Zone of the audio in which augmentation may be applied. Default value is (0.2, 0.8), which means that no augmentation will be applied in the first 20% and the last 20% of the audio.
  • coverage (float) – Portion of the zone to augment. Value should be between 0 and 1. If 1 is assigned, the augment operation will be applied to the whole target audio segment. For example, if the audio duration is 60 seconds while zone and coverage are (0.2, 0.8) and 0.7 respectively, 25.2 seconds ((0.8-0.2)*0.7*60) of audio will be augmented.
  • duration (int) – Duration of augmentation (in seconds). Default value is None. If a value is provided, the coverage value will be ignored.
  • factor (tuple) – The pitch of the input audio will be raised or lowered. The shift value will be picked within the range of this tuple. Default value is (-10, 10). A negative value lowers the pitch and a positive value raises it.
  • name (str) – Name of this augmenter.
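
The interaction between zone, coverage and duration can be checked with a few lines of arithmetic. The following is a minimal sketch that only reproduces the calculation described in the parameter list; `augmented_span` is a hypothetical helper written for illustration and is not part of the library, which may round to whole samples internally.

```python
def augmented_span(total_seconds, zone=(0.2, 0.8), coverage=1.0, duration=None):
    """Return (zone_start, zone_end, augmented_seconds) for a clip.

    Hypothetical helper, not a library function: it mirrors the
    documented arithmetic only.
    """
    zone_start = zone[0] * total_seconds
    zone_end = zone[1] * total_seconds
    zone_length = zone_end - zone_start
    if duration is not None:
        # A provided duration takes precedence; coverage is ignored.
        augmented = float(duration)
    else:
        # Otherwise, augment the given fraction of the zone.
        augmented = zone_length * coverage
    return zone_start, zone_end, augmented


# 60-second clip, zone (0.2, 0.8), coverage 0.7
start, end, seconds = augmented_span(60, zone=(0.2, 0.8), coverage=0.7)
print(round(start, 2), round(end, 2), round(seconds, 2))
```

For the 60-second example, the zone spans seconds 12 to 48 (36 seconds), and 70% of that zone is 25.2 seconds.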
>>> import nlpaug.augmenter.audio as naa
>>> aug = naa.PitchAug(sampling_rate=44100)
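
To give a feel for what a factor range such as (-10, 10) means, the sketch below draws one shift value from the range and converts it to a frequency ratio. It rests on two assumptions that this page does not state: that the value is drawn uniformly, and that it is measured in semitones (12 per octave). Both helper functions are hypothetical and are not library calls.

```python
import random


def sample_pitch_shift(factor=(-10, 10), rng=random):
    """Draw one shift value from the factor range (uniform draw is assumed)."""
    low, high = factor
    return rng.uniform(low, high)


def semitones_to_ratio(steps):
    """Frequency ratio for a shift of `steps` semitones (unit is assumed)."""
    return 2.0 ** (steps / 12.0)


# Seeded generator so the illustration is repeatable.
shift = sample_pitch_shift((-10, 10), random.Random(0))
print(f"shift={shift:+.2f} steps, frequency ratio={semitones_to_ratio(shift):.3f}")
```

Under these assumptions, a shift of +12 doubles every frequency (one octave up) and a shift of -12 halves it.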
augment(data, n=1, num_thread=1)
  • data (object/list) – Data for augmentation. It can be a list of data (e.g. a list of strings or numpy arrays) or a single element (e.g. a string or a numpy array).
  • n (int) – Number of unique augmented outputs. Default is 1. Will be forced to 1 if the input is a list of data.
  • num_thread (int) – Number of threads for data augmentation. Use this option when you are using CPU and n is larger than 1.

Returns: Augmented data

>>> augmented_data = aug.augment(data)