Text-to-Speech and AI Music

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Processes".

Deadline for manuscript submissions: 14 May 2025

Special Issue Editor


Guest Editor
School of Computer Science and Technology, Fudan University, Shanghai 200433, China
Interests: text to speech; voice conversion; talking face; music AI; large model

Special Issue Information

Dear Colleagues,

The field of audio research has long been dominated by two primary areas of study: speech and music. With rapid advances in deep learning and the emergence of large-scale models, both domains now face heightened demands and many new challenges. This Special Issue aims to bring together researchers working on speech synthesis and music artificial intelligence (AI) to share their insights and findings and to advance both fields. Its scope is broad and covers topics at the intersection of audio technology and AI. We welcome submissions that address, but are not limited to, the following research directions:

  • Speech synthesis: the creation of natural-sounding speech from text using AI models;
  • Voice conversion: techniques for altering the voice characteristics of one speaker to match those of another;
  • Talking face generation: the synthesis of visual speech movements in faces that correspond with the generated speech;
  • Melody extraction: algorithms for extracting the main melody from complex musical compositions;
  • Vocal accompaniment separation: methods for isolating vocals from an instrumental accompaniment in a musical piece;
  • Singing voice synthesis: the generation of singing voices from lyrics or melodies;
  • Automatic music composition: AI-driven processes for creating original musical pieces;
  • Humming recognition: technologies that identify songs based on a user's humming;
  • Music AI: broad explorations into how AI can innovate and enhance music creation, performance, and interaction.

Objectives and Goals

The primary objectives of this Special Issue are to:

  • Showcase the latest research and developments in speech synthesis and music AI;
  • Foster interdisciplinary collaboration among researchers, engineers, and industry professionals;
  • Encourage the submission of high-quality, original research that addresses current challenges and proposes innovative solutions;
  • Disseminate knowledge and promote the exchange of ideas to inspire future research directions;
  • Facilitate the integration of theoretical advancements with practical applications in the audio industry.

Dr. Xulong Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at mdpi.longhoe.net by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • text to speech
  • voice conversion
  • talking face
  • singing voice synthesis
  • melody extraction
  • music AI

Published Papers

This special issue is now open for submission.