Apple Machine Learning Journal

  • However, to achieve high accuracy, the training sets need to be large, diverse, and accurately annotated, which is costly.
  • An alternative to labelling huge amounts of data is to use synthetic images from a simulator.
  • This is cheap as there is no labeling cost, but the synthetic images may not be realistic enough, resulting in poor generalization on real test images.
  • We show that training models on these refined images leads to significant improvements in accuracy on various machine learning tasks.
  • Read the article “Improving the Realism of Synthetic Images”

Most successful examples of neural nets today are trained with supervision. However, to achieve high accuracy, the training sets need to be large, diverse, and accurately annotated, which is costly. An alternative to labelling huge amounts of data is to use synthetic images from a simulator. This is cheap as there is no labeling cost, but the synthetic images may not be realistic enough, resulting in poor generalization on real test images. To help close this performance gap, we’ve developed a method for refining synthetic images to make them look more realistic. We show that training models on these refined images leads to significant improvements in accuracy on various machine learning tasks.

MusicNet

MusicNet: A curated collection of labeled classical music  #music #machinelearning

  • Automatic music transcription, inferring a musical score from a recording, is a long-standing open problem in the music information retrieval community.
  • Identify precise onset times of the notes in a recording.
  • The MusicNet labels apply exclusively to Creative Commons and Public Domain recordings, and as such we can distribute and re-distribute the MusicNet labels together with their corresponding recordings.
  • MusicNet is a collection of 330 freely-licensed classical music recordings, together with over 1 million annotated labels indicating the precise time of each note in every recording, the instrument that plays each note, and the note’s position in the metrical structure of the composition.
  • The labels are acquired from musical scores aligned to recordings by dynamic time warping.

MusicNet is a collection of 330 freely-licensed classical music recordings, together with over 1 million annotated labels indicating the precise time of each note in every recording, the instrument that plays each note, and the note’s position in the metrical structure of the composition. The labels are acquired from musical scores aligned to recordings by dynamic time warping. The labels are verified by trained musicians; we estimate a labeling error rate of 4%. We offer the MusicNet labels to the machine learning and music communities as a resource for training models and a common benchmark for comparing results.
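
To give a rough sense of what aligning a score to a recording by dynamic time warping involves, here is a minimal sketch of the textbook algorithm. It is not MusicNet's actual pipeline: the function name `dtw_path`, the use of bare MIDI pitch numbers as features, and the absolute-difference distance are all assumptions made for illustration. A real system would compare audio features rather than pitch numbers.

```python
import math


def dtw_path(score, recording, dist=lambda a, b: abs(a - b)):
    """Align two sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (score_index,
    recording_index) pairs. Hypothetical helper, for illustration only.
    """
    n, m = len(score), len(recording)
    # cost[i][j]: minimal cumulative cost of aligning score[:i] with recording[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(score[i - 1], recording[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example (assumed data): three score notes, five recording frames.
score_notes = [60, 62, 64]
recording_frames = [60, 60, 62, 64, 64]
total, path = dtw_path(score_notes, recording_frames)
# path == [(0, 0), (0, 1), (1, 2), (2, 3), (2, 4)]
```

Each pair in `path` maps a score note to a recording frame, so the first frame paired with a note gives an estimate of that note's onset in the recording.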

Baidu open-sources Python-driven machine learning framework

  • Training can be distributed across a cluster of machines, with or without GPUs.
  • Chinese search engine giant Baidu now has an open source project in the same vein: a machine learning system it claims is easier to train and use because it exposes its functions through Python libraries.
  • Many of the latest machine learning and data science tools purport to be easy to work with compared to previous generations of such frameworks and libraries.
  • The user can program directly to the C++ libraries, but PaddlePaddle provides a Python library, PyDataProvider2, that removes much of the heavy lifting from the training process.
  • Baidu has plans in the works for adding support for other languages when performing predictions, but Xu said there are currently no intentions to support anything other than Python for model training.

Baidu employs the PaddlePaddle framework internally for prediction systems, along with Python to make training models and deriving predictions a snap.