The Computational Complexity of Music

Angela Chu
6 min readMay 19, 2021

Have you ever wondered what the complexity of arranging music meant for several instruments down to a single instrument is? If so, this article is for you. If not, maybe you’ll find this to be an interesting read regardless. Today, we’re going to break down a report composed by William S. Moses and Erik D. Demaine.

Before we get to the meat of the proof, we need to delve into some background information. Moses and Demaine hypothesized that some problems involving music arrangements were computationally hard to solve, while others were easy in comparison.

For example, the edit distance between two songs, or the minimum number of primitive edit operations that transform one song to another, and the optimal fingering, the best finger to play each note, can be computed in polynomial time.

Moses and Demaine claimed that various constraints, which we’ll dive into later in this article, affect the computational complexity of arranging music originally written for one set of instruments down to a single instrument.

For those of you who haven’t worked with computational complexity, let’s conduct a brief overview of some important vocab terms. Algorithms for solving problems are classified into several complexity classes, which include but are not limited to P (Polynomial-time), NP (Non-deterministic Polynomial-time), and EXP (Exponential time).

If a problem is in P, that means it is computable in polynomial time. If a problem is in NP, that means that given a certificate that a word w is in the language, we can verify that it is in polynomial time, but not necessarily solve it in polynomial time. If a problem is in NP-Complete, that means it is in both NP and is NP-hard.

Moses and Demaine defined constraints of two types: universal and specific. Specific constraints depend on the limitations for the artist playing the instrument, which means they vary by instrument or artist. These constraints are consonance/dissonance, the maximum number of notes that appear in chords, and speed limitations (these restrict how quickly notes can be changed). Consonance refers to music that is pleasant to the ears, and dissonance refers to music that is unpleasant.

Universal constraints, on the other hand, are constraints that can be applied to arranging music in the general sense. For these constraints, melodies are not allowed to be split. That is, the pairs of notes and times associated with each part Tᵢ are either all included are not at all. In addition, a certain percentage of notes are required to be included in the new arrangement. That is, if the percentage p = 50%, then 50% of notes played at a time tᵢ in the original song are still played in the arrangement.

For the purposes of the report, Moses and Demaine defined chords as consonant or dissonant based on the number of half-steps (notes) in between the chords. They supply the definition below:

If the number of half steps between the notes follows the top formula, the chord is in consonance, and if the number of half steps between the notes follows the bottom formula, the chord is in dissonance.

And below are a few examples of consonant intervals:

These chords are played in the video below. As expected, they sound pleasant to the ear.

On the contrary, here are some sample dissonant chords:

If you want to learn more, you can read about consonant and dissonant intervals here.

We now can dive into the heart of the problem. Although Moses and Demaine devised several proofs that focused on several constraints, for the purpose of keeping this article relatively brief we shall only go into the first one, which focuses on the complexity of creating an arrangement where all chords are in consonance (essentially, a piece that “sounds good”).

They claimed that this problem is NP-Hard, that is, it is at least as hard as the hardest problems in NP. The proof idea is to reduce on 3SAT, a known NP-Complete problem. The constraints for this proof are the following: no splitting melodies, no pairs of notes being played at the same time are in dissonance, and at any given time, at least p notes from the original song need to be played (Assume p = 50%).

For those of you who aren’t familiar with 3SAT, let’s do a brief overview of what it is.

3SAT is a language that consists of boolean formulas that are in the 3CNF form.

The language 3SAT consists of any boolean formulas in 3CNF that are satisfiable, meaning that there is some assignments for the variables x1 … xn such that the entire formula is true. Although we won’t be proving 3SAT is NP-Complete, we can do so by reducing on SAT, which is the language that consists of all satisfiable boolean formulas.

To reduce the arranging music problem down to 3SAT, we can start by creating a variable gadget by adding two parts per gadget (either True or False). A measure is dedicated to each variable, where two instruments are playing a note. An example is shown below:

Let x₁ = d (the top note). It follows that the bottom note, d flat, is not x₁. If x₁ is set to true, then the bottom note will be false.

Since p= 50%, we are forced to play only one of these parts, since playing both would result in 100% (there is no in-between). It follows that the variable must be either true or false.

Recall that one of the constraints was that we had to include each part entirely or not at all (think all or nothing), so we can make the parts that are true include notes and the parts that are false include no notes.

A clause gadget can be created by having a measure where only the true literal and the corresponding variables in the 3SAT clause are playing notes. The requirement that at least 50% of the notes need to be played translates to at least one of the literals being true.

To generalize the proof, we can have p take on any rational value between 0 and 1. This can be done by padding the aforementioned gadgets with the correct number of true and false literals to make the original score correspond to the selected p value.

The image below demonstrates an example reduction where p = 50%:

The first two measures initialize the variables, the next two represent the formula, and the last four measures represent the solution.

While this method of arranging music certainly suffices, it seems very limited and would probably produce a simplified version of songs that are more musically challenging than the typical piece. A more complex piece would also result in a large number of variables, which doesn’t seem to be the most efficient.

The reduction from 3SAT was clever, but it seems the solution to the problem is a bit limited in its scope. This is because the constraints chosen wouldn’t necessarily apply to all arrangements of music. For example, not all chords in music are in consonance 100% of the time. There are many pieces that contain dissonant chords that sound pleasant to the ear.

The main takeaways? Using Moses and Demaine’s reduction on 3SAT, we now know that arranging music is computationally hard. In addition, this problem belongs to the complexity class NP, which makes the problem NP-Complete.

You’re also probably wondering: what are some real-world applications of this finding? Well, this can be applied to music games such as Rock Band or Guitar Hero, where pieces are arranged so that they can be played on four notes on a faux instrument. Dance Dance Revolution is another prime example. This problem can also be applied to musical choreography; more specifically, ballet and ice skating, where the moves would represent the notes on the instrument.

Finally, here are some closing questions: what are your thoughts on this reduction? Do you play an instrument? Let me know in the comments!

--

--