Tail-anchored (TA) proteins play essential roles in mammalian cells, and their accurate localization is critical for proteostasis. Biophysical similarities lead to mistargeting of mitochondrial TA proteins to the ER, where they are delivered to the insertase, the ER membrane protein complex (EMC). Leveraging an improved structural model of the human EMC, we used mutagenesis and site-specific crosslinking to map the path of a TA protein from its cytosolic capture by methionine-rich loops to its membrane insertion through a hydrophilic vestibule. Positively charged residues at the entrance to the vestibule function as a selectivity filter that uses charge-repulsion to reject mitochondrial TA proteins. Similarly, this selectivity filter retains the positively charged soluble domains of multipass substrates in the cytosol, thereby ensuring they adopt the correct topology and enforcing the “positive-inside” rule. Substrate discrimination by the EMC provides a biochemical explanation for one role of charge in TA protein sorting and protects compartment integrity by limiting protein misinsertion.
A hallmark of eukaryotic cells is their organization into subcellular compartments that spatially separate otherwise incompatible biochemical reactions. The evolution of compartmentalization enabled the increasingly complex cellular processes required for emergence of multicellular life. To carry out distinct functions, each compartment must contain a unique and precisely defined set of proteins and metabolites.
Membrane proteins comprise ∼20% of the human proteome (Krogh et al., 2001), and their localization is a primary determinant of organellar identity, underscoring the importance of their accurate sorting. Due to the presence of one or more hydrophobic transmembrane domains (TMDs), targeting and insertion of membrane proteins must be tightly regulated to prevent their aggregation in the aqueous cytosol. Canonical localization of many membrane proteins to mitochondria and the ER relies on cleavable targeting sequences that direct proteins to the correct organelle. Both the mitochondrial targeting sequence and the ER-specific signal sequence are proteolytically removed upon arrival at their respective compartment, and thus have evolved principally to ensure accurate sorting without the need to serve a functional role in the mature protein.
However, given the functional and topological diversity of the membrane proteome, many nascent proteins cannot utilize these stereotypical biogenesis pathways. In these cases, membrane proteins instead rely on recognition of a TMD and its surrounding residues for accurate sorting (Rapoport et al., 2017; Guna and Hegde, 2018). These sequences must therefore play dual roles, experiencing evolutionary pressure to both function in the mature protein (i.e., insertion, folding, and assembly) and ensure accurate localization.
One important family of membrane proteins that rely on their TMD and its flanking residues for recognition, targeting, and insertion are tail-anchored (TA) proteins (Kutay et al., 1993; Chio et al., 2017; Hegde and Keenan, 2011; Guna et al., 2022a). TA proteins are characterized by a single C-terminal TMD followed by a short soluble domain of up to 30–40 amino acids. Their globular N-termini are localized to the cytosol and are responsible for carrying out their diverse functions. Because of their topology, the TMD of a TA protein emerges from the exit tunnel of the ribosome only after translation termination, and they must be post-translationally targeted to the correct organelle. TA proteins are found on all cellular membranes and regulate essential processes such as neurotransmitter release via exocytosis (SNARE proteins), cholesterol synthesis at the ER (squalene synthase [SQS]), and the onset of apoptosis at mitochondria (BCL-2, Bak). Given their biophysical diversity and the limited information for targeting, how TA proteins are accurately sorted between compartments has been a long-standing open question in the field.
TA protein localization is thought to be primarily dictated by two features: (i) properties of the TMD including its hydrophobicity and helical propensity, and (ii) properties of the C-terminal soluble domain that must be translocated across the bilayer during insertion (Costello et al., 2017; Fry et al., 2021; Kalbfleisch et al., 2007). TA proteins with highly hydrophobic TMDs are preferentially targeted to the ER membrane for insertion via the guided entry of tail-anchored protein (GET) pathway (Schuldiner et al., 2005, 2008). Its central targeting factor in human cells, GET3 (Stefanovic and Hegde, 2007; Favaloro et al., 2008), binds TMDs using an ordered methionine-rich substrate binding groove and delivers its substrate TA proteins to the GET1/2 insertase for membrane integration (Mariappan et al., 2011). TA proteins with lower hydrophobicity TMDs, however, do not efficiently bind GET3 and thus cannot access the GET pathway (Guna et al., 2018). The largest classes of such low hydrophobicity TA proteins are those targeted to the ER, where they are inserted by the ER membrane protein complex (EMC; Jonikas et al., 2009; Christianson et al., 2011; Guna et al., 2018), and those targeted to the outer mitochondrial membrane, where they are inserted by MTCH1 and 2 (Guna et al., 2022b). Because of their biophysical similarity, there is thought to be some constitutive level of mistargeting between these compartments, necessitating dedicated quality control machinery at the ER and mitochondria to extract mislocalized TA proteins (Chen et al., 2014; Okreglak and Walter, 2014; McKenna et al., 2020).
Because functional constraints limit the potential diversity of the TMD alone, a second sequence element, the short polar C-terminal domain, is known to contribute to TA protein sorting (Isenmann et al., 1998; Kuroda et al., 1998; Borgese et al., 2007). Although biophysically diverse, mitochondrial TA proteins are enriched for positive charges in their C-terminal tails, while the C-termini of ER targeted TA proteins are more likely to be net neutral or negatively charged. Manipulation of C-terminal charge is known to be sufficient to shift the localization of TA proteins between the ER and mitochondria (Horie et al., 2002; Rao et al., 2016; Costello et al., 2017). However, the biochemical basis for how changes in charge can alter TA protein sorting is fundamentally not clear. Considering recent advances in our mechanistic understanding of TA protein insertion into the ER (Pleiner et al., 2020; Bai et al., 2020; O'Donnell et al., 2020; Miller-Vedam et al., 2020), we sought to re-examine the molecular basis for sorting specificity between mitochondrial and ER TA proteins at this cellular compartment.
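The notion of C-terminal net charge used throughout can be made concrete with a minimal sketch. The Python snippet below simply counts basic (K, R) and acidic (D, E) residues in a tail sequence; the function names, the decision to ignore histidine and the free C-terminal carboxylate, and the example tails are illustrative assumptions and are not taken from this study.

```python
# Hypothetical helper: approximate net charge of a TA protein's C-terminal tail
# at neutral pH by counting basic (K, R) and acidic (D, E) residues.
# Histidine and the free C-terminal carboxylate are ignored for simplicity.

POSITIVE = set("KR")
NEGATIVE = set("DE")


def net_charge(tail: str) -> int:
    """Return (#K + #R) - (#D + #E) for a one-letter amino acid sequence."""
    seq = tail.upper()
    positives = sum(aa in POSITIVE for aa in seq)
    negatives = sum(aa in NEGATIVE for aa in seq)
    return positives - negatives


def predicted_bias(tail: str) -> str:
    """Crude label based only on the sign of the tail's net charge."""
    if net_charge(tail) > 0:
        return "mitochondria-like (net positive)"
    return "ER-like (net neutral or negative)"


if __name__ == "__main__":
    # Made-up tails for illustration only; not sequences from this study.
    for tail in ("SDEG", "RKGSR", "GSGS"):
        print(tail, net_charge(tail), predicted_bias(tail))
```

Such a count captures only total charge; it says nothing about charge positioning relative to the TMD, which may also matter for sorting.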
Selectivity at the ER membrane
Previous studies of the canonical co-translational insertion pathway suggest that sorting fidelity is the combined result of contributions from cytosolic targeting steps and selectivity at the membrane (Trueman et al., 2012; Akopian et al., 2013; Jomaa et al., 2022). In the case of TA proteins, the source of this specificity at either step has remained elusive. While specificity during cytosolic targeting must undoubtedly contribute to TA protein localization, we found that even when loaded onto the identical chaperone in vitro, some mitochondrial TA proteins cannot be efficiently inserted into the ER membrane (Fig. 1 A). This selectivity appeared to correlate with C-terminal charge, because when positively charged amino acids were introduced within the C-terminus of the canonical ER TA protein SQS, its insertion efficiency was dramatically diminished. Based on these observations, we concluded that there must be a source of substrate discrimination directly at the ER membrane, with selectivity occurring at the insertion step.
The EMC is the major insertase for ER-destined TA proteins with lower hydrophobicity TMDs, which are similar to those of mitochondrial TA proteins. Consistent with this biophysical similarity, we and others have demonstrated that the EMC is responsible for misinsertion of mitochondrial TA proteins into the ER (Fig. 1, B–D; Guna et al., 2022b; McKenna et al., 2022). Using an established split GFP system to specifically query TA integration into the ER (Fig. 1 B; Inglis et al., 2020), we found that multiple mitochondrial TA proteins were misinserted in an EMC-dependent, but not GET1/2-dependent, manner (Fig. 1, C and D). We therefore reasoned that one source of discrimination against TA proteins with positively charged C-termini at the ER, whether mitochondrial TA proteins or the SQS mutants, must originate from properties of the EMC.
Substrate TMDs physically associate with the EMC’s hydrophilic vestibule
With the goal of determining the biochemical basis of EMC’s substrate specificity, we sought to map the path of a TMD from the cytosol into the bilayer through the EMC. Structures of the yeast and mammalian EMC identified two intramembrane surfaces that could potentially catalyze TMD insertion: a hydrophilic vestibule that positions several conserved positively charged residues within the cytosolic leaflet of the bilayer, and a hydrophobic crevice that contains a large lipid-filled wedge within the membrane (Pleiner et al., 2020; Bai et al., 2020; O’Donnell et al., 2020; Miller-Vedam et al., 2020). Site-specific crosslinking experiments previously identified EMC3 as the major substrate interaction partner within the purified EMC (Pleiner et al., 2020), consistent with EMC3’s homology with other members of the Oxa1 superfamily of insertases (Anghel et al., 2017). However, the path of a substrate TMD has never been directly determined, and potential contributions to insertion from both intramembrane surfaces of the EMC have been proposed.
To map direct physical association of substrates with the EMC, we exploited several independent zero-length crosslinking approaches to chart substrate interaction at single-residue resolution. First, we introduced the site-specific crosslinker 4-benzoylphenylalanine (BpA) into the TMD of a canonical EMC TA substrate and identified UV-dependent crosslinks to both EMC3 and EMC4 by immunoprecipitation (IP; Fig. S1 A). Unlike EMC3, which is present on both sides of the complex, the cytosolic and intramembrane surfaces of EMC4 partially enclose only the hydrophilic vestibule, suggesting substrates must at least transiently localize to this side of the EMC. Second, we exploited the fact that endogenous EMC3 does not contain any naturally occurring cysteine residues to perform disulfide crosslinking between a TA protein and the EMC. Because disulfide-bond formation can only occur between residues within 3–5 Å of each other, productive crosslinking necessarily indicates a direct physical association. Zero-length disulfide formation between single cysteines introduced at defined positions in EMC3 and a unique cysteine at two different positions within a substrate TMD identified a strong preference for substrate binding to the hydrophilic vestibule of detergent-solubilized EMC (Fig. 2, A and B; and Fig. S1 B). A similar preference was observed when comparing matched positions on either side of EMC3 at the base of the membrane. This preferential crosslinking was independent of cysteine position within the substrate TMD (Fig. S1 C) and was also observed upon incorporation of orientation-independent photo-crosslinkers in EMC3 (Fig. S1 D).
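The distance argument behind disulfide crosslinking can be illustrated with a minimal sketch. The Python snippet below tests whether two cysteine positions fall within the 3–5 Å range quoted above; the function name, the use of the 5 Å upper bound as a single cutoff, and the example coordinates are assumptions for illustration and do not come from the structural model.

```python
import math

# Upper bound of the 3–5 Å range quoted in the text, used here as an
# assumed single cutoff for a productive disulfide crosslink.
MAX_DISULFIDE_DISTANCE = 5.0  # Å


def can_crosslink(cys_a, cys_b, cutoff=MAX_DISULFIDE_DISTANCE):
    """Return True if two (x, y, z) positions in Å lie within the cutoff."""
    return math.dist(cys_a, cys_b) <= cutoff


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from any EMC structure.
    substrate_cys = (0.0, 0.0, 0.0)
    vestibule_cys = (3.0, 4.0, 0.0)   # 5.0 Å away
    crevice_cys = (6.0, 0.0, 0.0)     # 6.0 Å away
    print(can_crosslink(substrate_cys, vestibule_cys))
    print(can_crosslink(substrate_cys, crevice_cys))
```

A static distance check of this kind ignores side-chain flexibility and dynamics, which is one reason experimental crosslinking remains the direct readout.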
Finally, and most definitively, we developed a strategy to capture the transient interaction between a substrate TMD and the EMC by disulfide crosslinking in native, insertion-competent ER membranes (Fig. 2 C and Fig. S1 E). Using this approach, we again observed a marked preference for interaction of TA proteins with the hydrophilic vestibule of EMC3 compared to the hydrophobic crevice. Both in native membranes and with purified EMC, substrates preferentially crosslinked to a cytosol-facing position on EMC3 at the entrance to the lipid bilayer, suggesting a potential increase in dwell time at this location.
To further exclude a role for the opposite hydrophobic crevice in TA protein insertion, we introduced multiple mutations to polar and hydrophobic residues in this region and found that they were all dispensable for TA protein biogenesis in human cells (Fig. S1, F–H). These data, in combination with sequence conservation, homology to Oxa1 superfamily insertases, and mutational analysis, definitively identify the hydrophilic vestibule as the insertase-competent module of the EMC.
An improved model of the EMC defines intramembrane surfaces required for insertion
Having identified the hydrophilic vestibule as the major site of substrate binding to the EMC, we sought to better define its architecture and thereby identify potential sources of substrate specificity. The insertase core of the EMC (composed of EMC3 and 6) is partially enclosed by the dynamic subunits EMC4, 7, and 10. However, whether EMC7 and 10 contain TMDs, how these may be positioned, and what specific contributions each of the three auxiliary subunits makes were incompletely defined.
To characterize the biophysical properties of the hydrophilic vestibule we obtained an improved cryo-EM reconstruction of the human EMC that allowed us to unambiguously assign and position the three TMDs of EMC4 and the single TMDs of EMC7 and 10 (Fig. 3 A, Fig. S2, and Fig. S3, A–C; and Table S1). In support of this model, we biochemically confirmed that human EMC7 and 10 both contain single C-terminal TMDs that span the lipid bilayer (Fig. S3 D). Examination of the roles of these subunits suggested that, consistent with previous studies, EMC4 and 7, but not 10, are required for TA protein biogenesis (Fig. S3 E; Louie et al., 2012; Volkmar et al., 2019; Lakshminarayan et al., 2020). These auxiliary subunits do not play an architectural role in complex stability, as their depletion did not affect assembly of the core EMC subunits (EMC1, 2, 3, 5, 6, 8; Fig. S3 F). However, we additionally found that complete loss of EMC4 impaired the assembly of EMC7 and 10 into the EMC. Because EMC4’s C-terminal β-strand completes the membrane-proximal β-propeller of EMC1, it is possible that loss of EMC4 disrupts the lumenal binding sites of EMC7 and 10. We concluded that the hydrophilic vestibule formed by the TMDs and cytosolic loops of EMC3 and 6 is partially enclosed by the five dynamic TMDs of EMC4, 7, and 10.
Capture of substrate TA proteins in the cytosol by the EMC
Based on this improved model of the EMC, we determined that the cytosolic loops of EMC3 and 7 are positioned immediately below the hydrophilic vestibule, making them prime candidates for cytosolic capture of substrates. We had previously shown that the flexible loops of EMC3 contain conserved methionine residues, commonly found in the TMD binding domains of cytosolic chaperones, that were important for EMC function (Pleiner et al., 2020). We therefore hypothesized that the loops of EMC3 and 7 could be involved in physically interacting with substrate TMDs in the cytosol. We set out to test key facets of this working model, with the goal of understanding whether the molecular details of substrate capture could contribute to discrimination between ER and mitochondrial TA proteins.
Consistent with earlier data, we found that methionine residues within the cytosolic loop of EMC3 were essential for TA protein biogenesis in cells (Fig. 3 B and Fig. S4, A and B). Similarly, we found that the flexible C-terminus of EMC7 was required for EMC function (Fig. 3 C and Fig. S4, C–F). Deletion of 12 residues to disrupt a predicted amphipathic α-helix, but not deletion of a matched upstream α-helix, strongly impaired SQS biogenesis, nearly phenocopying EMC7 knockout. We further demonstrated that the hydrophobicity of conserved residues within both this amphipathic helix of EMC7 and the methionine-rich loops of EMC3 is important, because their mutation to leucine, but not alanine or glutamate, supported WT levels of EMC function in cells (Fig. 3 C and Fig. S4, A–F). However, for these loops to be directly involved in TA protein capture, they must be capable of physically interacting with substrate TMDs. Indeed, using zero-length disulfide crosslinking, we found that the cytosolic loops of EMC3 and 7 specifically interact with substrates in a TMD-dependent manner (Fig. 3, B and D; and Fig. S4, G and H).
We concluded that the primary role of these flexible loops is to position hydrophobic residues within the cytosol, which physically capture substrate TMDs for subsequent insertion into the membrane. To test whether TA capture in the cytosol could contribute to substrate selectivity by the EMC, we used site-specific crosslinking to compare the interaction of the TMD of WT and mutant SQS, containing a positively charged C-terminus, with the loops of EMC3. We observed only a modest decrease in cytosolic capture of the positively charged SQS mutant (Fig. S4 I), suggesting that capture by EMC3 and 7 did not substantially contribute to substrate discrimination based on C-terminal charge. We therefore turned to consideration of the intramembrane surfaces of the hydrophilic vestibule.
Substrates must pass through a positively charged hydrophilic vestibule for insertion
The improved atomic model of the EMC enabled detailed structure-function analysis of the biophysical requirements of the hydrophilic vestibule for TA protein insertion. The defining characteristic of the hydrophilic vestibule is a network of conserved polar and positively charged residues within the cytosolic leaflet of the lipid bilayer. Previous analysis suggests that charged and polar residues required for EMC function are positioned within the TMDs of the core insertase subunits EMC3 and 6 (Pleiner et al., 2020). Mutations to the positively charged residues in EMC3 strongly impaired insertion in cells, whereas mutations to EMC6 had only mild effects.
A more complete understanding of the localization of EMC4, 7, and 10 allowed us to systematically introduce mutations to all of the polar residues that face the EMC3/6 insertase core (Fig. 3 E). However, we found that mutations to polar, charged, and methionine residues within EMC4’s TMDs had little to no effect on TA protein biogenesis (Fig. 3 E and Fig. S5, A–C). Only mutations of residues that likely affect TMD packing (N140) or lipid headgroup interaction (K67) showed significant phenotypes. If EMC4 does not directly contribute to function, it may instead be playing a role in regulating access to the hydrophilic vestibule, as deletion of its cytosolic EMC2-binding site strongly impaired SQS biogenesis (Fig. S5, D and E). Of all the polar intramembrane residues tested within the hydrophilic vestibule, the highly conserved R31 and R180 of EMC3 are the most crucial for TA protein insertion, and their combined mutation displayed an additive effect on substrate biogenesis (Fig. 3 E; and Fig. S5, F and G).
Positively charged soluble domains impede insertion by the EMC
Together, these mutational data and our crosslinking results suggest that substrates must pass into the membrane directly along a positively charged surface of EMC3. Mislocalization of a mitochondrial TA protein into the ER requires both insertion of its TMD and translocation of its associated positively charged C-terminal domain. Thus, we reasoned that the positively charged hydrophilic vestibule is ideally positioned to discriminate between mitochondrial and ER TA proteins through charge repulsion (Fig. 4 A).
To test the fundamental premise of this hypothesis, we first characterized the impact of charge on insertion by the EMC. In order to directly query the role of C-terminal charge, without confounding effects from comparing different substrates or TMDs, we generated a series of mutants of the canonical ER TA protein, SQS, containing increasing amounts of positive charge within its soluble C-terminal domain (Fig. 4 B). Using the split GFP reporter system, we found that while all SQS mutants inserted into the ER in an EMC-dependent manner, insertion efficiency was inversely correlated with positive charge (Fig. 4, C and D). Even addition of a single positive charge to the C-terminus of SQS resulted in a dramatic decrease in integration into the ER. Validating that this effect is specifically occurring at the insertion step and cannot be explained by other effects in cells (e.g., substrate stability), we observed a similar trend between charge and insertion into ER microsomes in vitro (Fig. 4 E).
In addition to its role in TA protein insertion, the EMC co-translationally inserts the first Nexo TMD (N-terminus facing the ER lumen) of many G protein-coupled receptors (GPCRs) that do not contain signal sequences (Chitwood et al., 2018). Like the C-termini of ER TA proteins, these GPCRs contain N-termini that are typically short, unstructured, and net negatively charged (Fig. 5 A; Wallin and von Heijne, 1995). Using the EMC-dependent GPCR Opioid Receptor Kappa 1 (OPRK1), we found that introduction of positive charge is again inversely correlated with insertion propensity by the EMC (Fig. 5, B and C). We therefore propose that inefficient translocation of positively charged extracellular domains is an inherent property of the EMC shared by both its co- and post-translational insertase function.
The EMC selectivity filter enforces TA protein sorting fidelity and the positive-inside rule
The EMC’s strong bias against translocation of positively charged domains provides a biochemical explanation for discrimination of mitochondrial TA proteins at the ER. To determine if this selectivity is due at least in part to charge repulsion between the hydrophilic vestibule of the EMC and the soluble C-terminal domain of a substrate TA protein, we tested whether manipulation of the electrostatic potential of the EMC could alter substrate selectivity.
Due to the prominent location of R31 and R180 of EMC3 at the cytosolic entrance to the hydrophilic vestibule, these residues are ideally positioned to form a charge barrier that selectively prevents translocation across the lipid bilayer. If true, mutations that alter the electrostatic potential of these residues could alleviate repulsion between the EMC and positively charged soluble domains, allowing increased misinsertion of mitochondrial TA proteins. Mutation of both EMC3 R31 and R180 to alanine or glutamate did not affect EMC assembly, and as expected markedly impaired insertion of SQS in cells using our ratiometric fluorescent reporter system (Fig. 6, A and B; and Fig. S5 H). However, SQS variants containing increasingly positively charged C-termini showed increased insertion by the glutamate, but not the alanine mutant EMC. A similar trend was observed for insertion of SQS variants in vitro into WT, alanine, or glutamate mutant ER microsomes, validating that charge specifically affects insertion propensity (Fig. 6 C). Similarly, these EMC3 mutations differentially affected the insertion of the co-translational substrate OPRK1 and its positively charged N-terminal domain mutants in cells (Fig. 6 D).
Because these SQS variants serve only as a proxy for the effects of charge on insertion, we tested whether manipulation of the EMC selectivity filter could also affect mislocalization of bona fide mitochondrial TA proteins into the ER. Indeed, we found that multiple mitochondrial TA proteins, most notably RHOT1, showed increased ER insertion upon expression of the glutamate, but not the alanine mutant of EMC3 in cells and in vitro (Fig. 7, A and B). Fis1, MAOA, and MAOB similarly showed increased ER insertion. Even with increased mistargeting of TA proteins to the ER, induced by depletion of the outer mitochondrial membrane insertase MTCH2 (Guna et al., 2022a), the selectivity filter at the EMC limited mitochondrial TA protein mislocalization to the ER (Fig. S5 I).
Based on this strong preference by the EMC against translocation of positively charged domains, we next tested whether charge repulsion could be used by the EMC to more broadly enforce the correct topology of multipass membrane proteins. Earlier work suggests the EMC assesses the topology-defining signal anchor of nascent membrane proteins after ER targeting and handover from the signal recognition particle (SRP; Chitwood et al., 2018). The N-terminal domains of type II (Ncyt) multipass proteins face the cytosol when inserted in the correct topology and are enriched for positive charge. We postulated that the positively charged selectivity filter of the EMC would therefore reject such TMDs. To test this directly, we analyzed the extent of Nexo misinsertion of the GFP11-tagged Ncyt model protein TRAM2 in the presence of the EMC3 selectivity filter mutations. Indeed, the negatively charged glutamate mutant, but not the alanine mutant, increased insertion of TRAM2 in the incorrect Nexo topology (Fig. 7, C and D; and Fig. S5 J). This misinserted population is subject to ER-associated degradation, as indicated by its stabilization upon treatment with the p97 inhibitor CB-5083 (Fig. 7 D). We therefore concluded that the EMC selectivity filter additionally limits misinsertion of multipass proteins in the incorrect topology and thus contributes to enforcing the “positive-inside” rule (von Heijne, 1986).
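The positive-inside rule invoked here can be sketched as a simple heuristic. The Python snippet below compares the number of K/R residues flanking a TMD on its N- and C-terminal sides and assigns the more positive flank to the cytosol; this is a deliberately simplified illustration, and the function names, the use of raw K/R counts rather than net charge, and the example flanks are assumptions and not the analysis performed in this study.

```python
def count_positive(seq: str) -> int:
    """Count lysine and arginine residues in a one-letter sequence."""
    return sum(aa in "KR" for aa in seq.upper())


def predicted_topology(n_flank: str, c_flank: str) -> str:
    """Assign the more positively charged flank of a TMD to the cytosol.

    Returns 'Ncyt' if the N-terminal flank carries more K/R than the
    C-terminal flank, 'Nexo' if it carries fewer, and 'ambiguous' on a tie.
    """
    n_pos = count_positive(n_flank)
    c_pos = count_positive(c_flank)
    if n_pos > c_pos:
        return "Ncyt"
    if n_pos < c_pos:
        return "Nexo"
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical flanking sequences for illustration only.
    print(predicted_topology("MKRRS", "DEG"))
    print(predicted_topology("MDEG", "KRK"))
```

In this simplified picture, a selectivity filter that disfavors translocation of a positively charged flank would bias such proteins toward the predicted topology.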
These results suggest that charge repulsion at the EMC provides a selectivity filter to control the subcellular localization of TA proteins (Fig. 7 E), enforcing their accurate sorting between the ER and mitochondrial outer membrane. The enrichment of positive charge in the C-termini of mitochondrial (and likely peroxisomal) TA proteins serves as a flag for discrimination at the ER by the EMC. Unlike their TMDs, which must mediate function and targeting, the C-terminal domains of most TA proteins are functionally dispensable and may have evolved primarily to facilitate sorting specificity. The combined evolution of the positively charged C-termini of mitochondrial TA proteins and the positively charged hydrophilic vestibule of the EMC thereby limits misinsertion of TA proteins at the ER membrane.
The molecular basis for TA protein discrimination was revealed by a systematic analysis of substrate insertion in vitro and in cells that defines the path through the hydrophilic vestibule of the EMC into the membrane. After delivery to the ER by a cytosolic chaperone, the first step in substrate insertion is handover and capture by the EMC. We found that substrate TMDs physically interact with the conserved, hydrophobic loops of EMC3 and EMC7 located immediately beneath the vestibule in the cytosol. Mutational analysis suggests that only the hydrophobicity of these loops, but not their specific amino acid sequence, is important for TA protein insertion. Indeed, comparison of EMC3 with its bacterial and archaeal homologs suggests that methionine-rich cytosolic loops are a conserved feature of Oxa1 superfamily insertases (Borowska et al., 2015), but the specific positioning of these hydrophobic residues is not strictly critical. We propose that these hydrophobic loops represent the first transient, flexible interaction site for substrate TMDs by the EMC.
We observed that substrates crosslink more efficiently to both these loops and the cytosol-exposed residues of the hydrophilic vestibule than to residues within the lipid bilayer. This difference was especially pronounced in native insertion-competent membranes, which are more likely to capture on-pathway intermediates rather than artifacts of detergent solubilization. These data would be consistent with a longer dwell time of substrates in this cytosolic intermediate followed by faster partitioning into the lipid bilayer. Similarly, a recent kinetic analysis of the bacterial insertase YidC suggests rapid substrate capture via its cytosolic loops and substantially slower translocation of the polar domain and membrane insertion (Laskowski et al., 2021). A plausible explanation for this observation is that translocation of a polar domain across the hydrophobic lipid bilayer has a high energetic barrier and thus is a rate-limiting step of insertion.
This would be consistent with molecular dynamics simulations that suggest that TMD partitioning into the membrane is an energetically favorable process and membrane protein insertases are primarily required to decrease the energetic barrier for translocation of a soluble domain across the bilayer (Nicolaus et al., 2021; White and Wimley, 1999). Therefore, interaction of a substrate TMD with EMC’s cytosolic hydrophobic loops could prevent aggregation, while its C-terminus probes the hydrophilic vestibule. For correctly targeted TA proteins, the EMC’s hydrophilic vestibule serves as a funnel that catalyzes translocation of their C-termini into the ER lumen by providing a hydrophobicity gradient between the aqueous cytosol and the core of the bilayer. Positioning of similar hydrophilic grooves or vestibules within a locally thinned membrane is a common feature of evolutionarily distinct protein translocases (Kumazaki et al., 2014; Voorhees et al., 2014; McDowell et al., 2020; Wu et al., 2020), and represents a striking example of convergent evolution. In the case of the EMC, the dynamic TMDs of EMC4, 7, and 10 provide a protected environment, devoid of any potential off-pathway interaction partners, for the nascent protein to sample the bilayer.
However, for mistargeted mitochondrial or peroxisomal TA proteins, the positive net charge of the hydrophilic vestibule would impose a kinetic barrier to translocation of their positively charged C-terminal domains. In these TA proteins, positive charges are frequently found clustered near their TMD, suggesting that net charge alone may not determine the extent of charge repulsion at the EMC. Repulsion likely delays translocation and thus increases the chance of TA protein dissociation from the hydrophobic loops. Using purified components, we previously showed that the cytosolic domain of the EMC does not contain an ordered high-affinity TMD binding site (Pleiner et al., 2021), as can be found in GET3 or SRP (Guna and Hegde, 2018). A composite transient TMD capture surface formed by flexible hydrophobic loops might allow for faster dissociation of TA protein clients and thus enable quicker accept/reject decisions. Rejected TA proteins in the cytosol could then be either recaptured for targeting to the correct organelle or triaged for degradation by quality control machinery. In this way, the EMC provides an additional layer of specificity to the accurate sorting of the ∼600 TA proteins that must be expressed and localized in human cells.
The degree to which mitochondrial TA protein misinsertion into the ER is affected by the EMC selectivity filter is variable and likely influenced by multiple factors. For example, the inherent propensity for mistargeting to the ER differs among mitochondrial TA proteins (Guna et al., 2022b). Additionally, detailed sequence features of a TA protein’s C-terminal domain (i.e., total charge, charge density/positioning, secondary structure propensity) or TMD itself (i.e., helical propensity, length, hydrophobicity) might alter the effect of the EMC selectivity filter. The rules that determine the dependency of an individual TA protein on the selectivity filter represent an important question for future work.
The two positively charged residues in EMC3, which provide the charge barrier for entrance to the hydrophilic vestibule, are universally conserved among Oxa1 superfamily insertases. Consistent with this, its homologs, including GET1 and YidC, have also been suggested to inefficiently translocate positively charged soluble domains (Rao et al., 2016; Soman et al., 2014). Indeed, the effect of charge on insertion efficiency appears to be an inherent quality of the EMC and affects both its post- and co-translational substrates. Similar to EMC’s TA protein substrates, GPCRs that lack an N-terminal signal sequence, and are therefore potential EMC clients, typically contain neutral or negatively charged N-terminal extracellular domains (Fig. 5 A; Wallin and von Heijne, 1995). Using the same strategy as for discrimination of mitochondrial TA proteins, the EMC also enforces the positive-inside rule (von Heijne, 1986) for a subset of co-translational multipass substrates that meet its general client criteria (i.e., those that lack a signal sequence and contain a short and unstructured N-terminal domain). For Ncyt multipass clients, the EMC selectivity filter imposes correct topology by using charge repulsion to limit translocation of their typically positively charged N-terminal cytosolic domains into the ER lumen. The resulting longer dwell times at the EMC for Ncyt clients then likely trigger transfer to Sec61 for insertion in the correct topology.
Given that signal-sequence-containing proteins are delivered to the ER membrane via the same route as multipass membrane proteins, it is likely that signal sequences also transiently sample EMC’s hydrophilic vestibule. Their frequently positively charged N-terminal region could mediate their rejection by the EMC selectivity filter and thus trigger handover to Sec61 for insertion in the correct Ncyt topology, required for signal sequence cleavage. In this model, the biophysical properties of the N-terminal region would dictate the extent of charge repulsion at the EMC and therefore modulate signal sequence topogenesis. We thus propose that the EMC might contribute to the previously observed Nexo misinsertion of signal-sequence-containing proteins that makes them substrates of corrective quality control pathways (McKenna et al., 2022). By extension, the selectivity filter in the EMC would play a further role in enforcing the correct topology of secreted proteins, along with TAs and multipass membrane proteins.
In summary, we have characterized the molecular logic for how the EMC contributes to selective membrane protein localization in human cells. Its function is analogous to the active role Sec61 plays in substrate selection and rejection at the ER (Trueman et al., 2012). Whether MTCH1 and 2 also confer similar contributions to substrate selectivity at the mitochondrial outer membrane is an important question for future research. However, specificity at the membrane is only one layer of the multi-faceted approach used to regulate protein sorting. Cells employ a sieved strategy in which the overall fidelity of protein localization is the combined result of selectivity at each biogenesis step including chaperone binding in the cytosol, insertion at the membrane, and extraction of misinserted substrates (Rao et al., 2016). How specificity is imparted during the targeting and extraction steps is an area that warrants further study. Particularly in metazoans, where membrane protein mislocalization can lead to disease (Juszkiewicz and Hegde, 2018), these steps are tightly coupled to quality control machinery that ensures immediate recognition and degradation of failed intermediates. By limiting misinsertion of TA proteins and preventing topological errors in multipass membrane proteins, the EMC serves as a guardian for protein biogenesis at the ER.
Materials and methods
Plasmids and antibodies
Constructs for in vitro translations in rabbit reticulocyte lysate were based on the pSP64 vector (Promega). Constructs for in vitro translation in the Escherichia coli PURExpress system were generated from the T7 PURExpress plasmid (New England Biolabs). pSpCas9(BB)-2A-Puro (PX459) and lentiCRISPR v2 were gifts from Feng Zhang (plasmids #48139 and #52961; Addgene). pLG1-puro non-targeting single guide RNA (sgRNA) 3, used for cloning CRISPRi sgRNAs, was a gift from Jacob Corn (plasmid #109003; Addgene). The second-generation lenti-viral packaging plasmid psPAX2 (plasmid #12260; Addgene) and envelope plasmid pMD2.G (plasmid #12259; Addgene) were gifts from Didier Trono. The pHAGE2 lenti-viral transfer plasmid was a gift of Magnus A. Hoffmann and Pamela Bjorkman (California Institute of Technology, Pasadena, CA, USA). For expression in K562 cells, a lenti-viral backbone containing a UCOE-EF-1α promoter and a 3′ WPRE element was used (plasmid #135448; Addgene), which was a kind gift of Martin Kampmann and Jonathan Weissman. The expression plasmid for the SENPEuB protease (plasmid #149333; Addgene) was a gift of Dirk Görlich. Plasmids for amber suppression in mammalian cells were kind gifts of Simon Elsässer (Karolinska Institutet, Stockholm, Sweden). Note that the mCherry variant of RFP was used throughout this study, but the simpler nomenclature of RFP is used in the text and figures. Similarly, EGFP is used throughout this study, but referred to as GFP.
The following antibodies were used in this study: rabbit polyclonal anti-EMC2 (25443-1-AP; Proteintech); mouse monoclonal anti-EMC3 (67205-1-Ig; Proteintech); rabbit polyclonal anti-EMC4 (27708-1-AP; Proteintech); rabbit polyclonal anti-EMC5 (A305-833; Bethyl Laboratories); rabbit polyclonal anti-EMC7 (27550-1-AP; Proteintech); rabbit monoclonal anti-EMC10 (ab180148; Abcam); rabbit polyclonal anti-GET2 (#359 002; Synaptic Systems); mouse monoclonal anti-HA-HRP (H6533; Millipore-Sigma); mouse monoclonal anti-FLAG M2-HRP (A8592; Millipore-Sigma). The rabbit polyclonal antibodies against BAG6 and GFP were gifts from Ramanujan Hegde (Chakrabarti and Hegde, 2009; Mariappan et al., 2010). Secondary antibodies used for Western blotting were goat anti-mouse- and anti-rabbit-HRP (#172-1011 and #170-6515; Bio-Rad). The chemiluminescent substrates used were SuperSignal West Pico PLUS and SuperSignal West Femto Maximum Sensitivity (34580 and 34096; Thermo Fisher Scientific). The signal was detected on Blue Devil Autoradiography Film (#30-101; Genesee Scientific).
The following sgRNAs were cloned into PX459 or lentiCRISPR v2 and used to generate knockout cell lines: EMC3 (5′-AAGAAAGTGATGATAACGAT-3′); EMC4 (5′-TCATACACACCATCATAGTA-3′); EMC6 (5′-GCCGCCTCGCTGATGAACGG-3′); EMC7 (5′-TTCTCCGTCTACCAGCACTC-3′); EMC10 (5′-AGTGCCAACTTCCGGAAGCG-3′). The following sgRNAs were cloned into pLG1 for CRISPRi knockdowns: non-targeting control (5′-GGCTCGGTCCCGCGTCGTCG-3′); EMC2 (5′-GCCATCTTCCCAGAACCTAG-3′); GET2 (5′-ATGTTGGCCGCCGCTGCGA-3′); MTCH2 (5′-GACGGAGCCACCAAGCGACC-3′).
The following siRNAs were used in this study: negative control no. 2 siRNA (#4390846) and EMC5 siRNA s41131 (both Silencer Select; Thermo Fisher Scientific).
Expression and purification of biotinylated anti-GFP and anti-ALFA nanobody
Protease-cleavable biotinylated anti-GFP and anti-ALFA tag nanobodies (Götzke et al., 2019; Kirchhofer et al., 2010) that were used for EMC purifications throughout this study were expressed in E. coli and purified using Ni2+-chelate affinity chromatography using protocols described in detail before (Pleiner et al., 2015, 2020; Stevens et al., 2023 Preprint). The expression of His14-Avi-SUMOEu1-anti GFP nanobody from plasmid pTP396 (#149336; Addgene) was carried out with the following modification. Instead of biotinylating the nanobody in vitro with purified biotin ligase BirA, pTP396 was expressed in the E. coli strain AVB101 (Avidity), which contains an IPTG-inducible plasmid for BirA co-expression. 50 µM biotin was added to the main culture 1 h before induction of nanobody and BirA expression.
The sequence of the ALFAST nanobody was derived from the original study describing its generation (Götzke et al., 2019) and cloned into pTP396. Expression was carried out in E. coli Rosetta-gami 2 cells (Millipore-Sigma) in a 1 liter scale for 6 h at 18°C after induction of protein expression with 0.2 mM IPTG. The resulting His14-Avi-SUMOEu1-anti ALFA nanobody fusion protein was purified from cell lysate using Ni2+-chelate affinity chromatography for in vitro biotinylation with purified biotin ligase BirA as described before (Pleiner et al., 2020).
Immobilized biotinylated nanobodies were cleaved off of streptavidin magnetic beads using an engineered SUMO protease (SENPEuB) that recognizes the SUMOEu1 module (Vera Rodriguez et al., 2019). His14-Tev-tagged SENPEuB protease (ID #149333; Addgene) was expressed in E. coli NEB express Iq as described before (Pleiner et al., 2020). For sequential IPs, a commercial system with orthogonal cleavage sites based on the SUMOStar tag and SUMOStar protease (LifeSensors; Liu et al., 2008) was used.
Conjugation of ALFA nanobody to HRP for Western blotting
To use the ALFA nanobody in Western blotting, it was coupled to HRP-maleimide via a single engineered C-terminal cysteine residue as described for other nanobodies before (Pleiner et al., 2018).
Mammalian in vitro translation
In vitro translation reactions in rabbit reticulocyte lysate (RRL) were carried out with in vitro transcribed mRNA as described before (Sharma et al., 2010). PCR products generated from pSP64-derived plasmids or gene fragments (synthesized by Integrated DNA Technologies or Twist Bioscience) served as templates for run-off transcription and contained a 5′ SP6 promoter followed by an open-reading frame and a 3′ stop codon. A 10 µl transcription reaction contained 7.6 µl T1 mix (Sharma et al., 2010), 0.2 µl SP6 polymerase (New England Biolabs), 0.2 µl RNAsin (Promega), and 100 ng PCR product, and was carried out for 1.5 h at 37°C. Transcriptions were added directly to RRL. Unless indicated otherwise, RRL was treated with S7 micrococcal nuclease (Roche) in the presence of CaCl2 to remove endogenous hemoglobin mRNA. Nascent proteins were labeled during translation reactions of 15–30 min at 32°C in RRL by incorporation of radioactive 35S-methionine (Perkin Elmer). Nascent TA proteins were released from the ribosome with 1 mM puromycin and then incubated with 5% (vol/vol) of either canine pancreatic rough microsomes (cRMs; Walter and Blobel, 1983) or human ER-derived microsomes (hRMs), prepared from engineered cell lines as described below, for another 20 min at 32°C. Samples were analyzed by SDS-PAGE and autoradiography to detect the translated 35S-labeled TA protein.
Successful post-translational insertion into microsomes was monitored by glycosylation of a canonical NXS/T acceptor motif. This was appended either as part of a charged C-terminal Opsin tag (MNGTEGPNFYVPFSNKTVD), or where no additional C-terminal domain charge was desired, an NGT motif was placed 22 amino acids downstream of the TMD after a neutral glycine-serine linker and followed by an additional C-terminal GS dipeptide.
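The acceptor motifs described above can be located with a short script. The following is a minimal Python sketch, not part of the published methods: it scans a sequence for N-X-S/T sequons, assuming the conventional rule that X may be any residue except proline (the text states only "NXS/T"). The function name `find_sequons` is hypothetical.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so sequons that
# overlap each other are still reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Opsin tag sequence as quoted in the text.
OPSIN_TAG = "MNGTEGPNFYVPFSNKTVD"
opsin_sites = find_sequons(OPSIN_TAG)
print(opsin_sites)
```

For the Opsin tag this reports the asparagines of the NGT and NKT motifs; the asparagine in NFY is skipped because it is not followed by Ser or Thr at the +2 position.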
Protease protection assay
To assess the membrane spanning topology of EMC7 and EMC10, they were tagged with an N-terminal 1xHA and a C-terminal 3xFLAG tag and translated in RRL in the presence of cRMs as described above. Protease-accessible regions of both proteins were digested by incubation with 0.5 mg/ml Proteinase K for 1 h at 4°C in the presence or absence of 0.05% (vol/vol) Triton X-100 to solubilize cRM membranes. Proteinase K was inactivated by addition of 5 mM PMSF and quick transfer into boiling SDS buffer (100 mM Tris/HCl, pH 8.4; 1% [wt/vol] SDS). Denatured digestion reactions were diluted tenfold with IP buffer (50 mM HEPES/KOH, pH 7.5; 300 mM NaCl; 0.5% [vol/vol] Triton X-100) and incubated with anti-HA or anti-FLAG M2 resin (Millipore-Sigma) for 1 h at 4°C for IP of protected fragments.
Preparation of hRMs
To prepare hRMs from Expi293 suspension cell lines, cells were harvested and then washed twice in 50 ml 1× PBS. Cells were then resuspended in 4× pellet volume of sucrose buffer (10 mM HEPES/KOH, pH 7.5; 2 mM MgAc; 250 mM Sucrose, 1× Protease inhibitor cocktail [Roche]) and lysed with ∼50 strokes in a tight-fit dounce homogenizer. Complete cell lysis was verified by trypan blue staining. The lysate was then diluted twofold and spun for 30 min at 3,214 g in a tabletop centrifuge at 4°C to remove nuclei and cell debris. This spin was repeated and the resulting supernatant was then centrifuged for 1 h at 75,000 g at 4°C (TLA-100.3 rotor or Type60 Ti rotor; Beckman Coulter). The supernatant was aspirated and the membrane pellet gently resuspended in microsome buffer (10 mM HEPES/KOH, pH 7.5; 1 mM MgAc; 250 mM Sucrose, 0.5 mM DTT). Membranes prepared for disulfide crosslinking were resuspended in microsome buffer without DTT. The absorbance at 280 nm of the resuspended membranes was measured after boiling an aliquot in SDS buffer (100 mM Tris/HCl, pH 8.4; 1% [wt/vol] SDS). The hRM preparation was then adjusted to an absorbance of 75 at 280 nm using microsome buffer. To remove endogenous mRNAs, the adjusted hRM preps were further treated with S7 micrococcal nuclease (Roche) at a concentration of 0.075 U/µl in the presence of 0.33 mM CaCl2 for 6 min in a 25°C water bath, then quickly removed to ice and quenched by Ca2+-chelation with 0.66 mM EGTA. Nucleased hRMs were snap-frozen in liquid nitrogen in single-use aliquots and stored until further use at −80°C.
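The adjustment of a membrane preparation to a target absorbance is a simple dilution calculation. The sketch below is illustrative only and assumes that A280 scales linearly with dilution (C1·V1 = C2·V2); the function name `buffer_to_add` and the example volumes are hypothetical and not taken from the study.

```python
def buffer_to_add(volume_ml: float, a280_measured: float,
                  a280_target: float = 75.0) -> float:
    """Volume of microsome buffer (ml) to add so that A280 reaches the target.

    Assumes absorbance is proportional to concentration (C1 * V1 = C2 * V2).
    """
    if a280_measured < a280_target:
        raise ValueError("sample is already more dilute than the target")
    final_volume_ml = volume_ml * a280_measured / a280_target
    return final_volume_ml - volume_ml


# Hypothetical example: 2 ml of membranes measured at A280 = 150.
extra_ml = buffer_to_add(2.0, 150.0)
print(f"add {extra_ml:.2f} ml buffer")
```

A preparation that reads below the target cannot be fixed by dilution, which is why the function raises an error in that case instead of returning a negative volume.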
In vitro translation of TA proteins in the PURExpress system
Plasmids containing a 5′ T7 promoter, followed by an open-reading frame, stop codon, and 3′ T7 terminator were used as templates for the coupled in vitro transcription/translation PURExpress system (New England Biolabs). The various SQS constructs used for cysteine crosslinking comprised an N-terminal 3xFLAG tag, the human Sec61β cytosolic linker (residues 2–59) with the natural cysteine at position 39 mutated to serine, as well as the five N-terminal flanking residues, TMD, and complete C-terminus of human FDFT1/SQS (residues 378–end). Cysteine residues were introduced at the indicated positions using site-directed mutagenesis. TA protein translations were supplemented with radioactive 35S-methionine and 10 µM purified Calmodulin (CaM; Shao et al., 2017).
For use in photocrosslinking reactions, TA protein substrates were generated that contained the unnatural amino acid and photocrosslinker BpA (Bachem), which was incorporated into the TMD by amber stop codon suppression in the PURExpress system lacking all release factors (ΔRF123; New England Biolabs). The release factors RF2 and RF3, but not RF1 (which recognizes the UAG [amber] stop codon), were added back to the reaction. BpA was added at 100 µM and incorporated at UAG codons using purified BpA aminoacyl-tRNA synthetase and suppressor tRNA, prepared as described before (Shao et al., 2017).
All PURE translation reactions were carried out for 2 h at 32°C and then ribosome-associated nascent chains were released by addition of 1 mM puromycin (Thermo Fisher Scientific) and further incubation for 10 min at 32°C. To remove aggregated protein, the translation reactions were layered over a 20% (wt/vol) sucrose cushion prepared in physiological salt buffer (50 mM HEPES/KOH, pH 7.5; 130 mM KAc, 2 mM MgAc) that further contained 100 nM CaCl2. After a 1 h spin at 55,000 rpm (TLS-55 rotor; Beckman-Coulter) at 4°C, soluble TA protein–CaM complexes were retrieved from the supernatant.
Purified EMC complexes in detergent micelles for photocrosslinking were obtained via anti-GFP nanobody IP from stable human suspension cell lines that ectopically express GFP-EMC2. They were mixed with 35S-Methionine labeled BpA-containing TA protein–CaM complexes generated in the PURExpress system as described above. TA proteins were released from CaM shortly before UV irradiation by addition of 1 mM EGTA to chelate calcium. Except for the −UV control sample, all reactions were irradiated at a distance of ∼7–10 cm with a UVP B-100 series lamp (Analytik Jena, Germany) for 15 min on ice before quenching with SDS-PAGE sample buffer. Samples were adjusted to 1% (wt/vol) SDS and boiled. Denatured reactions were diluted 10-fold with IP buffer (50 mM HEPES/KOH, pH 7.5; 300 mM NaCl; 0.5% [vol/vol] Triton X-100) and incubated with Protein A sepharose beads (Thermo Fisher Scientific) and EMC3 or EMC4 antibodies for IP. Samples were analyzed by SDS-PAGE and autoradiography.
Site-specific incorporation of the photocrosslinking amino acid 3′-azibutyl-N-carbamoyl-lysine (AbK) into EMC3 in mammalian cells was performed by amber suppression using the Methanosarcina mazei pyrrolysyl-tRNA synthetase (PylRS)/tRNAPylCUA (PylT) pair (Ai et al., 2011). Constructs for amber suppression in mammalian cells were created as follows using previously reported plasmids as template (Elsässer et al., 2016). The first plasmid encodes four copies of PylT(U25C), as well as WT PylRS, which was further modified by mutating Y306A and Y384F to accommodate the bulky AbK (Yanagisawa et al., 2008; O'Donnell et al., 2020). The coding region of EMC3 was inserted with a C-terminal GFP-tag into a second plasmid, which also encoded four additional copies of PylT(U25C). Selected amino acid positions in EMC3 were mutated to amber stop codons for incorporation of AbK at these sites. To generate AbK-containing EMC, Expi293 cells (Thermo Fisher Scientific) were transiently co-transfected with 4xPylT/PylRS(Y306A, Y384F) and 4xPylT/EMC3(Amber[TAG])-GFP plasmids at a ratio of 4:1 using PEI “MAX” (Polysciences). The cells were grown in the presence of 0.5 mM AbK (Iris Biotech) and harvested 72 h after transfection. EMC complexes with successfully suppressed Amber stop codons contained full length AbK-modified EMC3 and could thus be purified via the C-terminal GFP-tag as described below. The purified EMC complexes were mixed with 35S-Methionine labeled SQS(WT)–CaM complexes generated in the PURExpress system and irradiated with UV as described above. Samples were analyzed by SDS-PAGE and autoradiography.
EMC complexes containing WT or cysteine mutant EMC3 or EMC7 variants were purified from stable human suspension cell lines as described below and mixed with WT or cysteine mutant SQS–CaM complexes generated in the PURExpress system as described above. The zero-length disulfide crosslinker 4,4′-Dipyridyldisulfide (DPS; Millipore-Sigma) was added to a final concentration of 250 µM to initiate the crosslinking of cysteines in close proximity after SQS release from CaM with 1 mM EGTA. The reaction was incubated for 2 h on ice and analyzed by SDS-PAGE and autoradiography.
For disulfide crosslinking in membranes, hRMs were prepared from stable human suspension cell lines expressing WT or cysteine mutant EMC3 variants as described above. hRMs were mixed with PURE translated SQS–CaM complexes in physiological salt buffer and 500 µM DPS. After substrate release with 500 µM EGTA, reactions were incubated for 2 h on ice before quenching with 5 mM L-Cysteine (Millipore-Sigma). The reactions were then adjusted to 1% (wt/vol) SDS and incubated at room temperature for 10 min to denature the EMC complex. The denatured reactions were diluted tenfold with IP buffer (50 mM HEPES/KOH, pH 7.5; 300 mM NaCl; 0.5% [vol/vol] Triton X-100) and the EMC3-GFP subunit was specifically enriched via anti-GFP nanobody IP. After elution by boiling in sample buffer containing 0.5 M urea, the samples were analyzed by SDS-PAGE and autoradiography.
Cell culture and cell line generation
Adherent HEK293 cell lines were cultured in DMEM supplemented with 10% FCS and 2 mM L-Glutamine. For Flp-In T-Rex 293 cell lines containing integrated doxycycline-inducible reporters, tetracycline-free FCS was used and culture medium additionally supplemented with 15 µg/ml blasticidin S and 100 µg/ml hygromycin B. RPE1 cells were cultured in DMEM/F-12 (1:1) supplemented with 10% FCS and 2 mM L-Glutamine.
Flp-In 293 T-Rex cells were purchased from Thermo Fisher Scientific. Stable Flp-In 293 T-Rex cell lines designated as GFP-2A-RFP-SQS/vesicle-associated membrane protein 2 (VAMP2) express the RFP-tagged transmembrane domain and flanking regions of human SQS (SQS/FDFT1) or VAMP2. The generation of these cell lines was described previously (Guna et al., 2018; Pleiner et al., 2020). In these cell lines, GFP is expressed as a soluble cytosolic protein from the same mRNA as RFP-SQS/VAMP2 using a viral 2A sequence that induces peptide-bond skipping by the ribosome (de Felipe et al., 2006). Their RFP and GFP fluorescence intensity can be measured by flow cytometry to derive an RFP:GFP ratio. Changes in this ratio after perturbation, e.g., expression of a mutant EMC subunit, reflect differences in the post-translational stability of the TA protein reporter.
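The ratiometric readout described above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the analysis pipeline used in the study (which relied on FlowJo): it takes per-cell RFP and GFP intensities, computes the median per-cell RFP:GFP ratio, and expresses a perturbed sample relative to a control. The function names and all intensity values are hypothetical.

```python
from statistics import median


def rfp_gfp_ratio(rfp: list[float], gfp: list[float]) -> float:
    """Median per-cell RFP:GFP ratio, skipping cells without GFP signal."""
    ratios = [r / g for r, g in zip(rfp, gfp) if g > 0]
    if not ratios:
        raise ValueError("no cells with positive GFP signal")
    return median(ratios)


def relative_ratio(sample_ratio: float, control_ratio: float) -> float:
    """Express a sample's RFP:GFP ratio as a fraction of the control ratio."""
    return sample_ratio / control_ratio


# Hypothetical intensities: GFP (translation control) is unchanged, while
# the RFP-tagged TA protein reporter is halved in the perturbed sample.
control = rfp_gfp_ratio([200.0, 400.0, 800.0], [100.0, 200.0, 400.0])
mutant = rfp_gfp_ratio([100.0, 200.0, 400.0], [100.0, 200.0, 400.0])
print(relative_ratio(mutant, control))
```

Because GFP and the RFP-tagged reporter are translated from the same mRNA, dividing by GFP corrects for cell-to-cell differences in expression level, so a drop in the ratio points to reduced reporter stability.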
The stable, doxycycline-inducible GFP-EMC2 Flp-In 293 T-Rex cell line and its adaptation to suspension growth in FreeStyle 293 Expression Medium (Thermo Fisher Scientific) were described before (Pleiner et al., 2020). Clonal knockouts of EMC4, 7, and 10 in this background were obtained by transfecting the adherent parental cell line with PX459 encoding the respective sgRNA using TransIT-293 transfection reagent (Mirus). 48 h after transfection, 1 µg/ml puromycin was added for 3 consecutive days. Medium was subsequently exchanged to allow for 2 d of recovery before single cell clones were seeded into 96-well plates by limiting dilution. Knockout efficiency of the selected clones was verified by Western blotting, and the resulting adherent knockout cell lines were either used directly for flow cytometry experiments or adapted to suspension growth for EMC purifications.
Expi293 cells (Thermo Fisher Scientific) were maintained at a concentration of 0.5–2.0 million cells per ml in Expi293 Expression Medium (Thermo Fisher Scientific). An EMC3 knockdown suspension cell line was generated by transient transfection of Expi293 cells with an EMC3 sgRNA cloned into lentiCRISPR v2 using PEI “MAX” (Polysciences). Transfected cells were treated with 10 µg/ml puromycin for 4 consecutive days. Then the medium was exchanged to allow for 10 d of recovery. The polyclonal cell population demonstrated a sufficient level of consistent downregulation of endogenous EMC3 and was thus used directly to re-introduce WT EMC3 or various mutants tagged with a C-terminal TagBFP or GFP via lenti-viral transduction as described below. Transduced cell lines were sorted using fluorescence of the fused TagBFP or GFP to obtain a homogenous population of cells with near full replacement of endogenous EMC3 with a tagged mutant copy of interest. WT EMC7 or various cysteine mutants with an N-terminal ALFA tag were introduced via lenti-viral transduction into the EMC3-GFP cell line.
A K562 CRISPRi cell line, stably expressing dCas9-BFP-KRAB Tet-ON (Jost et al., 2017), was transduced with lentivirus as described below to constitutively express β-strands 1–10 of superfolder GFP (residues 2–214; Cabantous et al., 2005) in the ER lumen via fusion to an N-terminal signal sequence and a C-terminal KDEL sequence as described previously (Guna et al., 2022b).
K562 dCas9-BFP-KRAB Tet-ON, ER GFP1-10 cells were transduced via spinfection as described below with lentivirus containing a pLG1-puro backbone and an sgRNA targeting a gene of interest. Sequences of sgRNAs were derived from the hCRISPRi-v2 compact library (Horlbeck et al., 2016). 48 h after spinfection, 1 µg/ml puromycin was added for 3 consecutive days to select cells with a successfully integrated sgRNA expression cassette. After 2 d of recovery, cells were transduced with GFP11-tagged TA protein reporters expressed from a lentiviral backbone under control of a UCOE-EF1α promoter. Cells were analyzed 48 h after reporter spinfection by flow cytometry (8 d after sgRNA transduction).
Lentivirus was generated by co-transfection of HEK293T cells with a desired transfer plasmid and two packaging plasmids (psPAX2 and pMD2.G) using the TransIT-293 transfection reagent (Mirus). 48 h after transfection, culture supernatant was harvested, aliquoted, and flash-frozen in liquid nitrogen.
For transduction of Expi293 or suspension-adapted Flp-In 293 T-Rex cells, 20 million cells were mixed with 2.5 ml freshly harvested lenti-viral supernatant (i.e., the complete supernatant from one 6-well of lenti-producing HEK293T cells 48 h after transfection) in 20 ml medium in a 125 ml vented Erlenmeyer flask (Celltreat; Stevens et al., 2023 Preprint). Then the flask was transferred to a shaking incubator and transduced cells were grown for around 16 h. Cells were then pelleted, resuspended in 50 ml of fresh medium, and grown for 2–3 d before sorting of successfully transduced cells on an SH800S cell sorter (Sony Biotechnology).
K562 cells were transduced by spinfection. Briefly, 250,000 cells were mixed with 50–200 µl of lentiviral supernatant and RPMI medium in the presence of 8 µg/ml polybrene in a total volume of 1 ml in a 24-well plate. 24-well plates were spun at 1,000 g for 1.5 h at 30°C. Cells were then resuspended and transferred to a 6-well plate. Lenti-viral reporter constructs used in K562 cells for flow cytometry analysis all contained an upstream UCOE-EF1α promoter, followed by RFP, a P2A site, and the full-length human coding regions for all mitochondrial TA proteins fused to GFP11 via a five-residue Gly-Ser linker. SQS mutants were expressed in the same cassette, but contained the cytosolic linker (residues 2–70) of human Sec61β at the N-terminus followed by the TMD, N-terminal flanking region, and complete C-terminus of human FDFT1/SQS (residues 378–417 [end]). Charge mutations were introduced as shown in Fig. 4 B. EMC3 WT or its arginine mutants were expressed in K562 cells from a lentiviral transfer plasmid with an upstream EF1α promoter and fused to a C-terminal TagBFP-3xFLAG tag.
For lenti-viral transduction of adherent HEK293 or RPE1 cells, 50–200 µl lentiviral supernatant and 8 µg/ml polybrene (Millipore-Sigma) were usually added directly to ∼70% confluent cells in 2.5 ml culture medium in a 6-well. Lenti-viral reporter constructs of SQS and VAMP2 for use in HEK293 cells (Fig. 3, C and D; Fig. S3 E; Fig. S4, A, D, and E; and Fig. S5, A, B, and D) contained an upstream CMV promoter, followed by GFP, a 2A site, and RFP, which was directly fused to the TMD and flanking regions of human FDFT1/SQS or VAMP2 as described before (Guna et al., 2018; Pleiner et al., 2020). OPRK1 reporter constructs used in RPE1 cells expressed full-length human OPRK1 (WT/−5), OPRK1(E45K, D46R, E50K; +1 variant), or OPRK1(E35K, D37R, E45K, D46R, E50K; +5 variant) as N-terminal fusions to GFP, followed by a 2A site and RFP from a CMV promoter.
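The names of the OPRK1 charge variants follow from simple charge bookkeeping, which the sketch below makes explicit. It is an illustrative Python example, not code from the study: it assumes a simplified charge model (K and R count as +1, D and E as −1, all other residues including histidine as 0) and starts from the WT net charge of −5 stated in the text. The helper name `charge_shift` is hypothetical.

```python
import re

# Simplified side-chain charges at neutral pH (assumption: His treated as 0).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def charge_shift(mutations: list[str]) -> int:
    """Sum the change in net charge caused by point mutations such as 'E45K'."""
    total = 0
    for mut in mutations:
        match = MUTATION.match(mut.strip())
        if match is None:
            raise ValueError(f"cannot parse mutation: {mut!r}")
        old, _position, new = match.groups()
        total += CHARGE.get(new, 0) - CHARGE.get(old, 0)
    return total


WT_NET_CHARGE = -5  # N-terminal domain charge of WT OPRK1 as given in the text
plus1_variant = ["E45K", "D46R", "E50K"]
plus5_variant = ["E35K", "D37R", "E45K", "D46R", "E50K"]

print(WT_NET_CHARGE + charge_shift(plus1_variant))
print(WT_NET_CHARGE + charge_shift(plus5_variant))
```

Each acidic-to-basic substitution shifts the net charge by two units, so three substitutions move the domain from −5 to +1 and five substitutions move it to +5.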
Flow cytometry analysis of reporter cell lines
All adherent cells were trypsinized, washed, and resuspended in 1xPBS for flow cytometry analysis. K562 cells were analyzed directly. Analysis was either on an Attune NxT Flow Cytometer (Thermo Fisher Scientific) or a MACSQuant VYB (Miltenyi Biotec). Flow cytometry data was analyzed using FlowJo v10.8 Software (BD Life Sciences). Unstained cells transiently transfected with either GFP or RFP (or BFP if needed) were analyzed separately alongside every run as single-color controls for multicolor compensation using the FlowJo software package.
For experiments in K562 cells, lenti-viral fluorescent reporters were introduced via spinfection as described above, usually 48 h before analysis. To probe the effect of EMC2 or GET2 knockdown on reporter insertion, cells were additionally transduced with sgRNA-expressing lenti-viral vectors as described under “CRISPRi knockdowns.” To analyze the effect of EMC3 mutations on TA protein reporters, K562 cells were first spinfected with lentivirus expressing EMC3(WT/mut)-BFP. After 48 h, mitochondrial TA protein or SQS charge mutant reporter lentivirus was spinfected. Cells were analyzed by flow cytometry after another 48 h. For experiments with p97 inhibitor (CB-5083 [Selleckchem]), the cells were treated with 1.25 µM inhibitor for the last 6 h before analysis. Adherent HEK293 or RPE1 cells were analyzed 48 h after transduction as described above.
Purification of engineered EMCs from stable suspension cell lines
Stable human suspension cell lines expressing tagged WT or mutant copies of EMC subunits were generated and grown as described above. EMC complexes were purified using anti-GFP or anti-ALFA nanobody essentially as described before (Pleiner et al., 2020; Stevens et al., 2023 Preprint). Cells were harvested by centrifugation for 10 min at 3,000 g and washed in 1xPBS. Cell pellets were resuspended with 6.8 ml solubilization buffer (50 mM HEPES/KOH, pH 7.5; 200 mM NaCl; 2 mM MgAc; 1% [wt/vol] lauryl maltose neopentyl glycol [LMNG; Anatrace], 1 mM DTT, 1× complete EDTA-free protease inhibitor cocktail [Roche]) per 1 g of cell pellet and incubated for 30 min at 4°C. Lysates were cleared by centrifugation for 30 min at 4°C at 18,000 rpm (SS-34 rotor; Beckman-Coulter).
In parallel, Pierce magnetic Streptavidin beads (Thermo Fisher Scientific) were equilibrated in wash buffer (solubilization buffer with 0.0025% [wt/vol] LMNG) and then incubated with biotinylated anti-GFP or anti-ALFA tag nanobody, purified as described above. After nanobody immobilization, free biotin binding sites were blocked by incubation with wash buffer containing 10 µM dPEG24-biotin acid (Quanta Biodesign) for 10 min on ice. Blocked, nanobody-decorated beads were then added to cell lysate for binding to detergent-solubilized ALFA- or GFP-tagged EMC complexes for 1 h at 4°C with head-over-tail mixing. Magnetic beads were then collected and washed three times with wash buffer, before resuspension of the beads in wash buffer containing 250 nM SENPEuB protease in a volume amounting to one half of the original bead suspension volume. Protease elution was allowed to proceed for 20 min on ice. All EMC complexes purified for disulfide crosslinking were eluted in wash buffer without DTT.
EMC complexes containing fully replaced cysteine mutant EMC7 variants were purified via a two-step procedure using first the C-terminal GFP tag on EMC3 and then the N-terminal ALFA tag on EMC7. The GFP nanobody eluate, obtained by SENPEuB cleavage, was diluted 20-fold with wash buffer and incubated with beads containing immobilized ALFA nanobody. The ALFA nanobody was tagged with an orthogonal SUMOStar protease cleavage site and bound EMC was then eluted along with the ALFA nanobody in wash buffer containing 500 nM SUMOStar protease. The resulting eluate was aliquoted and flash-frozen in liquid nitrogen. The concentrations of purified EMC complexes for disulfide crosslinking were normalized by measuring GFP fluorescence on a BioTek Synergy HTX plate reader (Agilent). Normalization was verified by SDS-PAGE and Sypro Ruby staining (Thermo Fisher Scientific). If necessary, normalizations were adjusted based on the quantification of Sypro Ruby stained EMC subunit bands in Fiji.
Purification of EMC for structure determination
A suspension-adapted GFP-EMC2 Flp-In 293 T-Rex cell line (Pleiner et al., 2020) was used to purify the EMC for structural analysis. Additionally, EMC7 carrying a C-terminal ALFA tag was introduced into this cell line via lenti-viral transduction as described above. The lenti-viral transfer plasmid encoded EMC7-ALFA fused via a viral 2A sequence to BFP (EMC7-ALFA-2A-TagBFP). BFP fluorescence was used to sort a homogenous stable suspension cell line that ectopically expresses both GFP-EMC2 and EMC7-ALFA. EMC was purified as described above, but with the following minor modifications. Cells were solubilized with solubilization buffer containing 1% (wt/vol) glyco-diosgenin (GDN; Anatrace). The wash buffer contained 0.05% (wt/vol) GDN. Finally, the EMC eluate was concentrated using an Amicon Ultra 0.5 ml 100 K MWCO concentrator (Millipore-Sigma) and further purified via size-exclusion chromatography using a Superose 6 Increase 3.2/300 column (Cytiva) equilibrated in wash buffer (50 mM HEPES/KOH, pH 7.5; 200 mM NaCl; 2 mM MgAc; 0.05% [wt/vol] GDN; and 1 mM DTT). Fractions corresponding to the EMC were pooled and concentrated as above to 0.5 mg/ml. To reduce the conformational flexibility of EMC7 at the insertase side, we added stoichiometric amounts of purified ALFA nanobody (Götzke et al., 2019), which binds the C-terminal ALFA tag on EMC7.
Grid preparation and data collection
Cryo-EM grids were prepared by applying 3 μl of purified EMC at 0.5 mg/ml to glow discharged (60 s using a Pelco easiGlow, Emitech K100X at a plasma current of 20 mA), holey carbon grids (Quantifoil R1.2/1.3). The sample was blotted for 4–6 s with filter paper at 8°C, 100% humidity, at a blot force of −4 prior to plunging into liquid ethane for vitrification using the FEI Vitrobot Mark v4 ×2 (Thermo Fisher Scientific). The data set was acquired on a Titan Krios electron microscope (Thermo Fisher Scientific) operated at 300 keV equipped with a K3 direct electron detector and an energy filter (Gatan) with a 20-eV slit width. A total of 11,822 micrographs were collected using 3-by-3 pattern beam image shift, acquiring movies for three non-overlapping areas per hole, using an automated acquisition pipeline in SerialEM (Mastronarde, 2005). Movies were recorded with 40 frames at a magnification of 105,000× in super-resolution mode at a calibrated pixel size of 0.416 Å/pixel using a dose of 60 e−/Å2 at a dose rate of 16.0 e−/pixel/s and a defocus range of −1.0 to −3.0 μm.
The data processing workflow is summarized in Fig. S2 and was performed using cryoSPARC v.3.3–v.4.0 (Punjani et al., 2017). In short, 11,822 micrographs were motion corrected, dose weighted, and downsampled (twofold to 0.832 Å/pixel) using Patch Motion, followed by patch-based contrast transfer function (CTF) estimation using Patch CTF. A total of 10,206 movies were selected for further processing by manual curation using cutoffs for CTF fit (5.0 Å) and total motion (50 pixels). Particles were picked using the automated Blob Picker function with a particle diameter of 150–400 Å. After two rounds of 2D classification, 1,271,124 particles were used for two rounds of heterogeneous ab initio reconstruction (four volumes), using Maximum/Initial resolution of 9 and 7 Å, respectively, and an Initial/Final minibatch size of 400 and 1,200 particles, respectively. Once we obtained an initial map with clear features of the EMC, we reclassified the 1.2 million particles by 3D heterogeneous classification with one well-defined class of the EMC and three decoy classes, using a batch size of 5,000 particles per class and an initial low-pass filter of 50 Å. Prior to the final round of classification, 212,440 particles were re-extracted with a box size of 400 pixels. The final round of classification yielded a population of 193,900 particles that were further refined using non-uniform refinement to obtain a reconstruction at 3.5 Å resolution.
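The curation step described above can be illustrated with a minimal Python sketch. It assumes that micrographs were kept when the CTF fit resolution was 5.0 Å or better and the total motion was 50 pixels or less; the direction of both cutoffs is an assumption. The Micrograph record, its field names, and the example values are hypothetical and are not part of cryoSPARC.

```python
from dataclasses import dataclass

@dataclass
class Micrograph:
    """Hypothetical per-micrograph record; field names are illustrative only."""
    name: str
    ctf_fit_A: float        # CTF fit resolution in Angstrom (lower is better)
    total_motion_px: float  # accumulated full-frame motion in pixels

CTF_FIT_CUTOFF_A = 5.0   # cutoff quoted in the text
MOTION_CUTOFF_PX = 50.0  # cutoff quoted in the text

def passes_curation(m: Micrograph) -> bool:
    # Assumption: both cutoffs act as upper bounds.
    return m.ctf_fit_A <= CTF_FIT_CUTOFF_A and m.total_motion_px <= MOTION_CUTOFF_PX

# Made-up examples to show the filter; these are not values from the data set.
examples = [
    Micrograph("mic_001", ctf_fit_A=3.8, total_motion_px=12.0),
    Micrograph("mic_002", ctf_fit_A=6.4, total_motion_px=9.0),
    Micrograph("mic_003", ctf_fit_A=4.1, total_motion_px=75.0),
]
kept = [m for m in examples if passes_curation(m)]

# Fraction of the real data set retained, from the counts reported in the text.
retained_fraction = 10_206 / 11_822
print([m.name for m in kept], f"{retained_fraction:.1%} of micrographs retained")
```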
To explore the previously observed flexibility between the lumenal, membrane, and cytoplasmic domains, the particles were subjected to two rounds of 3D-variability analysis/clustering, selecting five modes and a filter resolution ranging from 4.0 to 8.0 Å. After carefully analyzing each reconstruction, a mode corresponding to a missing subunit of the EMC was identified. The subset of particles was then split into 20 clusters using 3D Variability Analysis Display for this mode. Particles belonging to the nine-subunit complex (156,706 particles) that contained high-resolution features were combined and refined using non-uniform refinement. This yielded a map with a resolution of 3.6 Å, in which we detected a stronger EM density for the TMDs of EMC4 and 7.
Particles belonging to the eight-subunit complex (37,194 particles) were combined and, as for the nine-subunit complex, refined using non-uniform refinement. This yielded a map with a resolution of 3.9 Å. All three maps (consensus, nine-, and eight-subunit) were post-processed by applying sharpening B factors of −112 Å², −103 Å², and −76 Å², respectively. Finally, for the analysis of EMC10’s TMD position, a low-pass filter of 5.5 Å was applied to each map using volume tools in cryoSPARC.
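The particle counts reported for the consensus, nine-subunit, and eight-subunit reconstructions can be cross-checked with a short Python sketch. It uses only the numbers quoted in the text; the variable names are illustrative.

```python
# Particle counts quoted in the text.
consensus = 193_900      # particles in the consensus reconstruction
nine_subunit = 156_706   # particles assigned to the nine-subunit complex
eight_subunit = 37_194   # particles assigned to the eight-subunit complex

# The two subsets should partition the consensus particle stack.
assert nine_subunit + eight_subunit == consensus

fraction_nine = nine_subunit / consensus
fraction_eight = eight_subunit / consensus
print(f"nine-subunit: {fraction_nine:.1%}, eight-subunit: {fraction_eight:.1%}")
```

The two subsets add up exactly to the consensus stack, which indicates that every consensus particle was assigned to one of the two complexes.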
All map resolutions were calculated at the final round of refinement using the gold-standard Fourier shell correlation (FSC) = 0.143 criterion from the half maps. Statistical details of the EMC EM maps are reported in Table S1.
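For readers unfamiliar with the FSC criterion, the following NumPy sketch shows how a Fourier shell correlation curve can be computed from two half maps and how the 0.143 threshold is read off. It is a simplified illustration, not the cryoSPARC implementation: it assumes cubic, unmasked maps, uses integer-radius shells, and applies no interpolation. The function names and the synthetic test volumes are hypothetical.

```python
import numpy as np

def fsc_curve(half1: np.ndarray, half2: np.ndarray) -> np.ndarray:
    """FSC per integer-radius shell for two cubic half maps of equal shape."""
    n = half1.shape[0]
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))
    # Distance of every Fourier voxel from the zero-frequency voxel.
    coords = np.indices(half1.shape) - n // 2
    shell = np.rint(np.sqrt((coords ** 2).sum(axis=0))).astype(int).ravel()
    # Sum the cross term and both power terms within each shell.
    cross = np.bincount(shell, weights=(f1 * np.conj(f2)).real.ravel())
    power1 = np.bincount(shell, weights=(np.abs(f1) ** 2).ravel())
    power2 = np.bincount(shell, weights=(np.abs(f2) ** 2).ravel())
    fsc = cross / np.sqrt(power1 * power2)
    return fsc[: n // 2]  # keep shells up to the Nyquist frequency

def resolution_at_threshold(fsc, box_px, pixel_size_A, threshold=0.143):
    """Resolution (in Angstrom) of the first shell whose FSC falls below threshold."""
    below = np.nonzero(fsc[1:] < threshold)[0]
    if below.size == 0:
        return 2.0 * pixel_size_A  # curve never crosses: report the Nyquist limit
    first_shell = below[0] + 1
    return box_px * pixel_size_A / first_shell

# Synthetic demonstration: a shared signal plus independent noise in each half.
rng = np.random.default_rng(0)
signal = rng.normal(size=(32, 32, 32))
half_a = signal + 0.5 * rng.normal(size=signal.shape)
half_b = signal + 0.5 * rng.normal(size=signal.shape)
curve = fsc_curve(half_a, half_b)
print(np.round(curve[8:], 2))
```

For the maps described here, box_px would be 400 and pixel_size_A would be 0.832, assuming the final refinement was run on the re-extracted particles at that pixel size.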
Model building and refinement
An initial model for the nine-subunit EMC was generated by docking the EMC structure in a lipid nanodisc (PDB: 6WW7; Pleiner et al., 2020) into the cryo-EM density using UCSF Chimera (Pettersen et al., 2004), followed by an initial round of refinement using Phenix (Liebschner et al., 2019). Next, for the poorly ordered TMDs of EMC4 and 7, high-confidence subcomplexes of EMC3 (residues 5–42 and 101–209), EMC4 (59–155), EMC6 (12–end), and EMC7 (155–178) were generated using AlphaFold2-Multimer in ColabFold (AlphaFold2_advanced.ipynb; Mirdita et al., 2022) and then rigid-body fitted into the densities. All models were then combined and further refined manually in COOT (Casañal et al., 2019; Emsley et al., 2010). Lipids, N-glycans, and disulfide bond pairs were added where justified by both the EM density and the chemical environment. The final model was refined against the nine-subunit map using phenix.real_space_refine. Although we could successfully model a backbone through the contiguous density of the TMDs of EMC4 and 7, we could not unambiguously assign their register, and therefore these TMDs were modeled as poly-Ala/Gly in the final model. Statistical details of the EMC model are reported in Table S1. Figures were made using PyMOL (Schrödinger LLC) and UCSF ChimeraX.
Online supplemental material
Fig. S1 shows crosslinking and cell reporter assay data in support of defining the hydrophilic vestibule as the insertase side of the EMC. Fig. S2 shows an overview of the cryo-EM data processing pipeline. Fig. S3 shows the updated atomic model of the EMC, as well as biochemical data characterizing the peripheral subunits EMC4, 7, and 10. Fig. S4 shows in-cell reporter assay and crosslinking data that demonstrate substrate capture by the cytosolic loops of EMC3 and 7. Fig. S5 shows data demonstrating that intramembrane residues in EMC4 do not contribute significantly to TA protein insertion, as well as data highlighting the cooperative effect of the mitochondrial insertase MTCH2 and the EMC selectivity filter in mitochondrial TA protein sorting. Table S1 shows cryo-EM data collection, refinement, and validation statistics.
The data reported in this work are available in the published article and its online supplemental material. The atomic coordinates and cryo-EM maps have been deposited and are openly available in the Protein Data Bank under accession code PDB 8S9S and in the Electron Microscopy Data Bank under accession codes EMDB-40245 (nine-subunit map), EMDB-40246 (consensus map), and EMDB-40247 (eight-subunit map).
We thank Songye Chen and Oliver Clarke for technical assistance, all members of the Voorhees lab for thoughtful discussion, and Alina Guna for critical reading of the manuscript. We thank Pamela Bjorkman for access to her lab’s cell sorter, as well as the Caltech Flow Cytometry facility and the Caltech Cryo-EM facility.
Cryo-electron microscopy was performed in the Beckman Institute Center for TEM at Caltech, and data were processed using the Caltech High Performance Cluster, supported by a grant from the Gordon and Betty Moore Foundation. This work was supported by the Heritage Medical Research Institute (R.M. Voorhees), the National Institutes of Health’s National Institute of General Medical Sciences DP2GM137412 (R.M. Voorhees), the Deutsche Forschungsgemeinschaft (T. Pleiner), and the Tianqiao and Chrissy Chen Institute (T. Pleiner, M. Hazu).
Author contributions: T. Pleiner, M. Hazu, G. Pinton Tomaleri, and R.M. Voorhees conceived the study. T. Pleiner, M. Hazu, and G. Pinton Tomaleri performed most of the experiments and analysis with assistance from V.N. Nguyen and K. Januszyk. T. Pleiner, M. Hazu, and R.M. Voorhees wrote the manuscript with input from all authors.
T. Pleiner, M. Hazu, and G. Pinton Tomaleri contributed equally to this paper.
Disclosures: R.M. Voorhees reported personal fees from Gate Biosciences and grants from Gate Biosciences outside the submitted work. R.M. Voorhees and G. Pinton Tomaleri are consultants for Gate Biosciences, and R.M. Voorhees is an equity holder. No other disclosures were reported.
K. Januszyk’s current affiliation is Neomorph, Inc., San Diego, CA, USA.