Now that AI composition is becoming popular, will the barrier for ordinary people to enter the music industry be lowered in the future?

The Disruption Effect and Emotional Connection of AI Products in the Music Industry

Starting with Tianyin, one AI product after another has taken the stage claiming to disrupt the music industry, but how effective are they (→_→)?

In my opinion, if you don’t understand the basic logic of the music industry, you shouldn’t discuss industry disruption.

The biggest difference between music and other industries is that music is not a necessity; it is a weak demand. To generate commercial value, it must establish an emotional connection with its target audience, and to actually turn a profit, that connection must be strong.

If you just want to write music that sounds passable, you could already do that with Band-in-a-Box, and nowadays you can piece a track together from Splice samples. But how many people can actually make money from that?

Software has long been able to imitate Bach’s counterpoint, but ordinary people have no reason to listen to these Bach imitations, because there is no emotional connection. Besides, much of Bach’s actual catalogue can be heard for free, so on what grounds would you charge others for the music you generate?

Naturally, you could take the algorithmic route: churn out works in bulk and let the recommendation algorithms do the matching. Many AI music companies are doing exactly this, but what does it have to do with ordinary people? An individual cannot afford to mass-produce music the way a company can. What’s more, this practice is now under scrutiny, and the streaming platforms are actively clamping down on it.

If an individual wants to break into the music industry, producing a finished piece is only the first step. The real challenge is building an emotional connection between the work and its target audience, and whether such a connection forms is highly contingent. Current AI music does nothing to address this. Personal expression that hasn’t been through long-term training and selection simply cannot reach the level of connecting emotionally with an audience. And those who have been through that training, won’t they use AI too? So ordinary people gain no real edge from AI at all, do they?

Certain AI teams should stop boasting that they will crush grassroots musicians while making their money off ordinary people.

AI composition will be hard to popularize; at its current level, AI is still at the stage of producing electronic garbage.

As a music-industry professional who has used the mainstream AI composition software, I can say from experience that ordinary people without musical training can tell no more than whether a result sounds good or bad. They have no idea whether an AI-generated melody is worth reworking, let alone how to rework it.

That said, AI has greatly shortened the tedious parts of the workflow: input some basic parameters, generate ten or eight melodies, pick one, tweak it a little, and the result roughly resembles a finished song. The efficiency really is high, and it can basically meet the bar for small projects with modest requirements. You can refer to this video for the specific steps:
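To make that loop concrete, here is a minimal Python sketch of the parameters-in, batch-out, pick-and-refine workflow described above. MusicGenClient, Params, and their methods are hypothetical stand-ins for illustration only; none of the tools reviewed here expose this exact API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Params:
    """Generation parameters, loosely modeled on the keyword/preset inputs
    these apps expose. All field names here are hypothetical."""
    style: str = "chinese-pop"
    bpm: int = 92
    key: str = "C major"
    lyrics: Optional[str] = None  # Tianyin, notably, ignores user lyrics today

class MusicGenClient:
    """Hypothetical stand-in for an AI composition backend (not a real API)."""

    def generate(self, params: Params, n_candidates: int = 8) -> list[str]:
        # A real tool would return audio; we return fake demo file names.
        return [f"{params.style}-demo-{i}.wav" for i in range(n_candidates)]

    def refine(self, candidate: str, notes: str) -> str:
        # A real tool would regenerate a section per the notes; we tag the name.
        return candidate.replace(".wav", "-edited.wav")

client = MusicGenClient()
batch = client.generate(Params(bpm=96), n_candidates=8)  # "ten or eight melodies"
keeper = batch[0]  # a human ear picks the keeper from the batch
final = client.refine(keeper, notes="smooth the chorus, fix the hook rhythm")
print(final)  # chinese-pop-demo-0-edited.wav
```

The human stays in the selection and refinement steps; the AI only supplies cheap candidates, which is exactly why a trained ear still matters.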

Although AI’s current ability is below the overall human level, its floor is higher than humans’ (after all, some people have no grasp of music theory and no strong musical sense). And for music that is not performed by humans, where no performance nuance can tell them apart, the region where AI’s and humans’ compositional abilities overlap becomes hard to distinguish.

For ordinary people to enter the field, AI would first have to reach the average level of industry professionals. But the trend is irreversible, and it’s hard to say what will happen once AI catches up with humans. As professionals, we had better embrace the change as early as possible.

Below is a comparison of the features of Suno, Voicemod, MusicLM, Riffusion, NetEase Tianyin, Changya, and QQ Music’s AI.

In my opinion, Suno is currently the strongest: it supports the most languages and covers the mainstream styles the general public is familiar with. Changya, for its part, has the gentlest learning curve; its output quality is slightly below Suno’s, but its floor is higher and its results more stable, making it a solid domestic music-AI product.

The following demonstrates the generated results of each AI composition tool:

Suno:

Let’s first look at a good example from Suno. In the Chinese-pop style it generates decent quality: the audio fidelity is a bit lacking and the synthesized vocals sound noticeably artificial, but the overall listen contains no strange or jarring melodies, and it has the rough shape of a releasable demo.

Suno’s problem is that it glitches often, swallowing or mispronouncing words and mixing in other languages, especially in sections that are supposed to have no vocals.

Riffusion:

Riffusion is a newly launched AI. Its quality is comparable to Suno’s, but it does not support multiple languages, and its genres center mainly on rap and pop. The generated audio is only about 10 seconds long, too short even to count as a demo.

Both of these foreign AIs require a VPN if regional restrictions block your access; in that case, consider the domestic AIs instead.

The best-known domestic AIs are NetEase Tianyin, Changya, Lingdong, and ACE. QQ Music also has AI features, but they are mainly for album covers, so they are fairly limited. Here we will mainly evaluate the two AIs that focus on composition.

NetEase Tianyin:

Tianyin is a PC-based AI tool that requires you to be a registered NetEase musician to use it. Its interface resembles a simplified version of professional composition software and carries a certain learning curve, but in exchange it offers more freedom and finer-grained editing. It currently does not support user-supplied lyrics, and the synchronization between vocals and accompaniment is mediocre. The generated results are as follows:

Changya:

Changya’s app mainly targets ordinary users and is a free download on the major app stores. Its generation logic is driven by keyword input: besides writing lyrics, creating melodies, and producing trial vocals, the AI can also generate album covers. Musical parameters are likewise entered as text, and the app ships with default presets that can be used directly without understanding them. Given lyrics similar to those used in the NetEase test, Changya blends vocals and accompaniment more effectively, for a more natural overall result. The generated results are as follows:

There are also some lesser-known domestic composition tools, but we won’t go into them here.

Some individual AI outputs have already approached the level of professionally released pop songs, and any technique or pattern that can be clearly described is something AI will eventually be able to reproduce. However, the songwriting market itself is small, which is why AI composition has developed more slowly than AI painting and AI writing. In the future, AI music and human music will likely compete on the same stage, each holding half the territory.
