Anatomy of Voice and Swallowing
The human voice as we hear it is really produced by the whole body. Much like a musical instrument, production of voice requires air pressure, a vibratory surface, and a resonating chamber. The air pressure is generated from the lungs, chest wall muscles, rib cage, diaphragm, and abdominal muscles. Without adequate air pressure (for example, poor breath support in some lung diseases, neurologic diseases, or spinal cord injuries) the rest of the voice production system breaks down. Singers and other professional voice users focus a significant amount of training on control of breath support.
The vibratory portion of voice production occurs in the voice box, or larynx. The larynx is a complex organ made of paired muscles and cartilages. The thyroid cartilage houses and supports the vocal folds (also called vocal cords). It is a large single cartilage with a prominence at its most superior and anterior aspect. This prominence, which lies just below the thyroid notch, is more pronounced in the male larynx and is often called the “Adam’s apple.” The thyroid cartilage sits atop the cricoid cartilage, which is shaped like a signet ring. Below the cricoid cartilage is the trachea, or windpipe. The subglottis extends from just below the vocal folds to about 1 cm into the trachea. The hyoid bone is superior to the thyroid cartilage and provides additional support to the larynx. The paired arytenoid cartilages form the joints primarily responsible for vocal fold movement. These are seen as prominences indicated in this endoscopic picture of the larynx.
Multiple pairs of intrinsic laryngeal muscles are responsible for vocal fold movement. Although the movement of these muscles is quite complex, their primary actions can be simplified for a general understanding. The paired thyroarytenoid muscles make up the bulk of the vocal folds and bring the vocal folds together during voice production. The paired cricothyroid muscles extend from the cricoid cartilage to the thyroid cartilage and serve to elongate the vocal folds, especially during high-pitched voicing. The paired posterior cricoarytenoid muscles are the only muscles that actively open the vocal folds. The single interarytenoid muscle helps to stabilize the vocal folds during voicing. Other muscles outside of the larynx also contribute to voice production. These are collectively known as the strap muscles.
The larynx is innervated by branches of cranial nerve X, the vagus nerve. The vagus nerve leaves the brainstem on each side of the skull and travels alongside the carotid artery into the chest. In the neck, the vagus nerve gives off a branch to each side of the larynx called the superior laryngeal nerve. The superior laryngeal nerve innervates the cricothyroid muscles and provides most of the sensation to the larynx. In the chest and abdomen, the vagus nerve innervates the esophagus and stomach. On the right side, the vagus nerve loops around the subclavian artery before returning to the neck as the right recurrent laryngeal nerve. On the left side, the vagus nerve loops around the arch of the aorta before returning to the neck as the left recurrent laryngeal nerve. On both sides, the recurrent laryngeal nerves travel alongside the trachea and esophagus, deep to the thyroid gland, to reach the larynx. The recurrent laryngeal nerves innervate all of the intrinsic muscles of the larynx except the cricothyroid muscles and provide sensation to the subglottis.
The vocal folds themselves are multilayered structures. The vocal folds meet anteriorly to form a “V” known as the anterior commissure. The bulk of the vocal fold is composed of the thyroarytenoid muscle, extending from the interior surface of the anterior thyroid cartilage to the arytenoid cartilage. This muscular layer is the deepest layer of the vocal fold. The vocal ligament lies just superficial to the muscle and is composed of the deep and intermediate layers of the lamina propria. The next most superficial layer is critical to human voice production: the superficial lamina propria. This gelatinous layer allows the mucosal cover of the vocal folds to vibrate over the “body” of the vocal folds (vocal ligament and muscle). When this layer is compromised, the vocal folds cannot vibrate appropriately, and the lining, or mucosal cover, of the vocal folds may scar to the underlying vocal ligament.
Vocal fold vibration produces a simple buzzing sound that only becomes recognizable as human voice as it passes through the resonating system. This system is composed of every structure in the airway from the vocal folds to the outside world, including the throat (pharynx), mouth, sinuses, and nose. The tongue, palate, and lips allow for articulation. It is important to distinguish speech, which is produced by the articulators, from voice, which is produced by vocal fold vibration. Any problem in the resonating system, such as nasal congestion or enlarged tonsils, can affect the resonance of the voice.
The swallowing process begins in the mouth and involves the teeth, tongue, facial muscles, lips, and salivary glands. This preparatory phase involves making the food bolus smaller, mixing it with saliva, and preparing it for transport to the throat. This phase of swallowing is under voluntary control. Saliva helps make the food bolus more slippery and contains enzymes that begin the process of digestion. Medicines or conditions that affect saliva production or consistency may impair this phase of swallowing. The preparatory phase is also affected by poor dentition (e.g., poorly fitting dentures or no teeth), inability to keep the mouth closed (e.g., lip cancer, facial nerve paralysis), and impaired tongue movement (e.g., head and neck cancer, stroke).
Once the food reaches the back part of the tongue, a swallow is triggered. (The term bolus refers to the liquid or food being swallowed.) During the oral phase, the anterior part of the tongue pushes against the roof of the mouth and the palate elevates. This provides a passageway into the throat and keeps the food bolus from going into the nasal cavity.
During the pharyngeal phase, the food bolus passes through the throat and into the esophagus. This phase is involuntary and irreversible once begun. Swallowing and breathing are intricately coordinated during this process. The back part of the tongue pushes the food bolus into the throat. The voice box elevates and moves forward as the throat elevates and generates a peristaltic contraction wave to move the food down. The epiglottis flips over the vocal cords as the vocal cords close tightly to keep food and drink from going into the airway. As the voice box moves forward, the upper esophageal sphincter opens to allow food to pass into the esophagus. The upper esophageal sphincter (UES) is primarily composed of the cricopharyngeus muscle, which stays contracted at rest and relaxes to open during the swallow.
The esophageal phase of swallowing is also completely involuntary. The esophagus is a long collapsed tube whose sole purpose is to transport food and drink from the throat to the stomach for digestion. The most proximal part of the esophagus is made of skeletal muscle; the next several centimeters are made of both skeletal and smooth muscle; and the distal half of the esophagus is all smooth muscle. Primary peristalsis is the sequential contraction of the esophagus from the top to the bottom. This creates a stripping effect to move the bolus down into the stomach. The lower esophageal sphincter (LES) is a thickening of the muscle of the esophagus at the level of the diaphragm. Both the tonic contraction of the esophageal muscle and the constriction by the diaphragm keep the LES closed most of the time. When the swallow is triggered, the LES opens to allow the bolus to pass into the stomach. Secondary peristalsis occurs in response to reflux of material from the stomach back into the esophagus and to residual bolus left in the esophagus after the primary stripping wave. Tertiary contractions are not peristaltic and often occur in elderly individuals.