Salmonella species are among the world’s most prevalent pathogens. Because the cell wall interfaces with the host, we designed a lipidomics approach to reveal pathogen-specific cell wall compounds. Among the molecules differentially expressed between Salmonella Paratyphi and S. Typhi, we focused on lipids that are enriched in S. Typhi, because it causes typhoid fever. We discovered a previously unknown family of trehalose phospholipids, 6,6′-diphosphatidyltrehalose (diPT) and 6-phosphatidyltrehalose (PT). Cardiolipin synthase B (ClsB) is essential for PT and diPT but not for cardiolipin biosynthesis. Chemotyping outperformed clsB homology analysis in evaluating synthesis of diPT. DiPT is restricted to a subset of Gram-negative bacteria: large amounts are produced by S. Typhi, lower amounts by other pathogens, and variable amounts by Escherichia coli strains. DiPT activates Mincle, a macrophage activating receptor that also recognizes mycobacterial cord factor (6,6′-trehalose dimycolate). Thus, Gram-negative bacteria show convergent function with mycobacteria. Overall, we discovered a previously unknown immunostimulant that is selectively expressed among medically important bacterial species.
Introduction
Salmonella enterica enterica Typhi (S. Typhi) is the etiologic agent of typhoid fever. This enteric fever has a high mortality rate if untreated and is responsible for ∼200,000 deaths annually (Mogasale et al., 2014). More generally, diarrheal diseases from Gram-negative bacteria are among the most prevalent and most deadly bacterial infectious diseases in the world, comparable to Mycobacterium tuberculosis, which causes tuberculosis (World Health Organization, 2018). S. Typhi lives in a membrane-bound vacuole, which does not fuse with lysosomes and permits the intracellular replication of the bacteria. Gram-negative bacterial cell walls are composed of an inner membrane dominated by phospholipids, a thin layer of peptidoglycan polymer, and an outer membrane with more complex lipids, which directly interface with the host.
Like other Gram-negative bacteria, Salmonella species synthesize LPS, which resides in the outer leaflet of the outer membrane and strongly stimulates the innate immune system by triggering TLR4 (Poltorak et al., 1998). LPS is among the most studied molecules in infectious diseases (Nature Reviews Immunology, 2011), where it unquestionably controls fever and sepsis as key manifestations of S. Typhi and other Gram-negative bacterial syndromes. However, anti-LPS therapies have had limited success in treating sepsis, and differing LPS chemotypes do not fully explain the markedly different immunogenicity, fever-inducing capacity, and pathogenicity of diverse Gram-negative bacterial species and strains. Further, given the focus on LPS as a strong stimulant for the mammalian immune system, the many other foreign molecules in Gram-negative cell walls have received less attention. To our knowledge, comparative lipidomic analyses of important pathogenic and nonpathogenic Gram-negative species have not been reported, raising the possibility that undiscovered virulence-associated lipids exist.
To test this hypothesis, we recently developed a rapid method that takes advantage of normal-phase chromatography to separate and analyze dozens of classes and thousands of molecular species of bacterial lipids by mass spectrometry (MS; Layre et al., 2011). Combined with manual methods of TLC, collision-induced dissociation MS (CID-MS), and nuclear magnetic resonance (NMR) spectroscopy, this system has proven effective for the discovery of previously unknown or pathogen-specific lipids, such as phosphomycoketides, dideoxymycobactins, and tuberculosinyladenosines (Matsunaga et al., 2004; Madigan et al., 2012; Layre et al., 2014). Using S. Typhi as an example of a major Gram-negative pathogen, we compared its lipids to those of less pathogenic serovars including S. Paratyphi, S. Enteritidis, and S. Typhimurium, generating clear evidence for strain-specific differences in lipid synthesis.
Focusing on the most abundant lipids that are selectively expressed by key pathogenic species, we discovered the products and defined the key genes of a new glycolipid biosynthesis pathway in a subset of pathogenic Gram-negative bacteria. We found that a gene product annotated as a cardiolipin (CL) synthase B (ClsB) functions as the essential enzyme for an abundant family of previously unknown immunogenic trehalose-containing phospholipids. Whereas most phylogenetic analyses and clinical strain typing rely on genetic methods, these studies illustrate the unique information available through systematic analysis of bacterial lipids. Similarities among mycobacterial and Salmonella trehalose-containing lipids suggest functional convergence in their activation of human immune response, highlighting a new pathway to adjuvant development.
Results
Lipidomic analysis of pathogenic Salmonella serovars
We used HPLC-MS–based comparative lipidomics (Layre et al., 2011) to study a major pathogenic S. enterica serovar, S. Typhi, and compare its lipid profile with the less virulent but closely related serovar, S. Paratyphi A. Each molecular species of lipid isolated from the bacteria, or its adduct, is detected as a three-component data point known as a molecular event. A molecular event consists of a retention time on the HPLC column, a mass-to-charge ratio (m/z), and an intensity value. The number of molecular events estimates the total number of lipids present, including chain length variants, altered adducts, and isotopes of each molecule. The two Salmonella serovars combined generated 4,569 molecular events (Fig. 1 a). As expected, molecular diversity is lower than the 6,000–10,000 events in the highly complex cell wall of M. tuberculosis (Layre et al., 2011). However, this number still represents substantial lipid diversity in a Gram-negative pathogen. Comparison of S. Typhi and S. Paratyphi A yielded 865 lipids that differed in intensity by at least twofold with a corrected P value of <0.05, documenting substantial divergence of the two lipidomes. This number exceeded our experimental throughput for compound identification, so we designed strategies to prioritize the unknown compounds for identification.
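The filtering step described above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the `MolecularEvent` class, its field names, and the inclusive twofold cutoff are assumptions, and the three events are toy values (only the m/z 1,002.612 event at 29.5 min echoes a compound named in the text).

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class MolecularEvent:
    """Hypothetical record: one retention time, one m/z, and intensities per serovar."""
    retention_time_min: float
    mz: float
    intensity_typhi: float       # assumed strictly positive
    intensity_paratyphi: float   # assumed strictly positive
    corrected_p: float


def log2_fold_change(event: MolecularEvent) -> float:
    """Positive values mean higher intensity in S. Typhi."""
    return math.log2(event.intensity_typhi / event.intensity_paratyphi)


def is_differential(event: MolecularEvent, min_fold: float = 2.0, max_p: float = 0.05) -> bool:
    """Assumed criterion: at least min_fold change in either direction and corrected P < max_p."""
    return abs(log2_fold_change(event)) >= math.log2(min_fold) and event.corrected_p < max_p


# Toy events with made-up intensities and P values.
events = [
    MolecularEvent(29.5, 1002.612, 9.0e5, 1.0e5, 0.001),  # enriched in S. Typhi
    MolecularEvent(12.0, 750.500, 1.1e5, 1.0e5, 0.600),   # essentially unchanged
    MolecularEvent(18.3, 820.400, 1.0e5, 4.0e5, 0.010),   # enriched in S. Paratyphi A
]

differential = [e for e in events if is_differential(e)]
typhi_enriched = [e for e in differential if log2_fold_change(e) > 0]
```

Working on a log2 scale treats enrichment in either serovar symmetrically, which is also how a volcano plot such as Fig. 1 a is usually drawn.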
Two previously unidentified, abundant lipids
To focus on lipids of high biological interest, including previously unknown compounds or candidate virulence factors, we prioritized molecular events that (1) were enriched in S. Typhi, (2) had high absolute intensity, and (3) had m/z values that did not match known compounds in LIPID MAPS or other databases (Fahy et al., 2009; Layre and Moody, 2013). Strain-specific enrichment of two abundant lipids was evident even by the relatively insensitive normal-phase TLC method (Fig. 1 b). Two spots with retardation factors (Rf) of 0.26 and 0.22 were much denser in S. Typhi than in S. Paratyphi A. Positive mode nanoelectrospray ionization (nano-ESI) MS analysis of TLC scrapings of these spots yielded spectra that were dominated by ions of m/z 1,694.2 and 1,029.6, respectively. These were seen along with ions corresponding to chain length and saturation variants that differed by 14 (CH2) or 12 (C) u, including m/z 1,722.2, m/z 1,057.6, and m/z 1,069.5 (Fig. 1 c). Initial discovery efforts emphasized HPLC time of flight (TOF) MS over TLC, because the former is ≈10⁶-fold more sensitive. That these two lipids were visible by the less sensitive technique suggests high abundance in S. Typhi (Fig. 1 b). Other lipids identified by the HPLC-lipidomics system are produced at <10 parts per million (ppm) of total cellular lipid (Moody et al., 2000, 2004). The limit of detection of charring on TLC is ∼1 µg, so any clearly visible spot in a profile from 300 µg of total bacterial lipid suggests production in the low parts per hundred range. If the combined density of the two unknowns is one fourth or more of that of the phosphatidylglycerol (PG) standard (40 µg), these two unknown lipids would comprise ∼3.3% of total extractable lipids of S. Typhi (Fig. 1 b).
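The abundance estimate in the preceding paragraph is simple arithmetic, reproduced here as a small Python sketch. The masses come from the text; the assumption that charring density scales linearly with lipid mass, and all variable names, are ours.

```python
TOTAL_LIPID_LOADED_UG = 300.0   # total bacterial lipid per TLC lane (from the text)
PG_STANDARD_UG = 40.0           # phosphatidylglycerol standard (from the text)
TLC_DETECTION_LIMIT_UG = 1.0    # approximate charring detection limit (from the text)


def percent_of_total(mass_ug: float, total_ug: float = TOTAL_LIPID_LOADED_UG) -> float:
    """Express a lipid mass as a percentage of the total lipid loaded."""
    return 100.0 * mass_ug / total_ug


# Assumption: spot density is proportional to mass, so one fourth of the
# PG standard's density corresponds to one fourth of its mass.
relative_density = 0.25
unknown_mass_ug = relative_density * PG_STANDARD_UG                      # 10 µg
unknown_percent = percent_of_total(unknown_mass_ug)                      # about 3.3%
detection_floor_percent = percent_of_total(TLC_DETECTION_LIMIT_UG)       # about 0.33%
```

Because one fourth is stated as a lower bound on the relative density, the roughly 3.3% figure is likewise a lower bound under this linearity assumption.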
As a first step to identification, we sought to match low mass accuracy values from TLC-nano-ESI MS (Fig. 1 c) to high mass resolution data from the HPLC-TOF-MS lipidomic data expressed as a volcano plot (Fig. 1 a) or scatter plot (Fig. 1 d). This approach identified ions corresponding to ammoniated adducts ([M+NH4]+) of M in the spectrum from the lipid with Rf 0.22 on TLC (Fig. 1 c). Using the higher mass accuracy data (m/z 1,002.612), we deduced M as likely being C48H89O18P. This suggested that the compound was a phospholipid, consistent with its observed retention time in HPLC-MS (Fig. 1 e; 29.5 min) matching known phosphoglycolipids (Layre et al., 2011). High accuracy mass value and molecular formula allowed searching of lipid databases, which returned no matches, suggesting that the target was previously unknown. Given that S. Typhi is a widely studied pathogen of worldwide significance, finding a previously unidentified compound among the most abundant lipids in this organism was unexpected. These data provided a strong rationale for focused analysis of the compounds of Rf 0.22 and 0.26 to determine their complete structures.
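The match between the observed ion and the proposed formula can be checked with a short calculation. The sketch below computes the monoisotopic m/z of the ammoniated adduct of C48H89O18P from standard monoisotopic atomic masses; the function and variable names are illustrative, and the ppm comparison is our addition rather than a step reported by the authors.

```python
# Standard monoisotopic atomic masses (u), rounded to seven decimals.
MONOISOTOPIC_MASS = {
    "C": 12.0000000,
    "H": 1.0078250,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737616,
}
ELECTRON_MASS = 0.0005486


def neutral_mass(formula: dict) -> float:
    """Monoisotopic mass of a neutral molecule given element counts."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())


def ammonium_adduct_mz(formula: dict) -> float:
    """m/z of the singly charged [M+NH4]+ ion: M + N + 4 H minus one electron."""
    return (
        neutral_mass(formula)
        + MONOISOTOPIC_MASS["N"]
        + 4 * MONOISOTOPIC_MASS["H"]
        - ELECTRON_MASS
    )


proposed_formula = {"C": 48, "H": 89, "O": 18, "P": 1}   # M as deduced in the text
observed_mz = 1002.612                                    # high mass accuracy value from the text

calculated_mz = ammonium_adduct_mz(proposed_formula)
ppm_error = (observed_mz - calculated_mz) / calculated_mz * 1e6
```

The calculated value agrees with the reported m/z 1,002.612 to within about 1 ppm, consistent with the assignment of this ion as [M+NH4]+ of C48H89O18P.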
Serovar-specific patterns of expression
Differential analysis of four pathogenic S. enterica serovars (S. Typhi, S. Paratyphi, S. Enteritidis, and S. Typhimurium) using three separately obtained clinical isolates within each serovar, suggested uniform abundance of the two unknown lipids among independent clinical isolates (Fig. 1 b). Looking across the serovars, the two lipids showed the same ranking with S. Typhi > S. Typhimurium ≈ S. Enteritidis > S. Paratyphi. These findings were consistent with the possibility of coregulation of two structurally related lipids in the same pathway. TLC (Fig. 1 b) resulted in barely discernable bands in S. Paratyphi. However, using the established strains S. Typhi Quailes and S. Paratyphi A strain NVGH308, ion chromatograms reproducibly demonstrated a coeluting lipid with a mass of 1,002.612 in S. Paratyphi A, confirming the presence of trace amounts of the unknown in this serovar (Fig. 1 e). Based on ion chromatogram intensity calculated as peak area, S. Paratyphi produces ≈ ninefold less product than S. Typhi. Overall, these data demonstrated serovar-specific synthesis of many lipids in Salmonella and identified two previously unknown phospholipids enriched in S. Typhi.
Discovery of two trehalose phospholipids
Nearly all MS signals from TLC-purified S. Typhi material corresponded to m/z 1,694.2 or m/z 1,029.6 and the identifiable isotopes and chain length variants thereof, suggesting each spot contained only one major product (Fig. 1 b). Initial characterization by ion trap CID-MS tentatively identified the lipids as two structurally related dihexose phospholipids that contained either one or two phosphatidyl groups (Figs. 2 a and S1). The lower band consisted of phosphatidyldihexose and the upper band of diphosphatidyldihexose. One of the fatty acids in the phosphatidyl group was a palmitic acid (C16). The other one was a C17:1, suggesting the presence of either an unsaturation or a cyclopropyl group.
Because CID-MS cannot unequivocally identify the particular hexoses or differentiate an unsaturation from a cyclopropyl group in the C17 fatty acid, we undertook one- and two-dimensional NMR spectroscopy analysis (Fig. 2 b and Dataset 1). This approach showed that the lower migrating unknown compound possesses two anomeric protons (doublets at δ 5.10 and 5.11 ppm) both with a coupling constant (J) of 3.7 Hz, suggesting an α,α-linked dihexose structure (Roslund et al., 2008). The signals between δ 3.3 and 4.1 ppm were complex, necessitating correlation spectroscopy (COSY) analysis, which allowed assignment of the protons at C1–C5, showing two C5 signals at δ 3.82 and δ 3.95 (Dataset 1 b). Two-dimensional total correlation spectroscopy (TOCSY) NMR (Gheysen et al., 2008) showed a one-spin system for the dihexose unit with five cross-peaks and the anomeric center, revealing the C6 signal at δ 4.08 (Dataset 1 c). Distortionless enhancement by polarization transfer heteronuclear single quantum coherence spectroscopy (135DEPT-HSQC; Dataset 1 d) showed clear C6 signals at δ 4.08 + δ 4.02 (with 13C = δ 65.6), and C6′ signals at δ 3.79 + δ 3.67 (with 13C = δ 62.3). Also, an additional C4 signal could be assigned at δ 3.36–71.7. The significant difference of the C6-signals in both the 1H and 13C-NMR resulted from a phosphate moiety at C6, as shown in the 31P-decoupled 1H-NMR (Fig. S2 e) and the CID-MS spectrum (Fig. 2 a). Coupling constants for the dihexose CH units were established to be in the 8- to 10-Hz interval, consistent with J-values for trans stereochemistry. The only hexose with all-trans stereochemistry between the H-atoms is glucose.
Finally, the α,α-trehalose-6-phosphate moiety was unambiguously verified by comparing the 1H- and 13C-NMR signals with that of α,α-trehalose-6-phosphate (Wang and Hollingsworth, 1995). The signals at δ 5.24 (m, 1H) and δ 4.00 (t, J = 5.6, 2H) and the diastereotopic protons at δ 4.45 (dd, J = 3.0, 12.0, 1H) and δ 4.20 (dd, J = 6.9, 12.0, 1H) were part of one spin system in two-dimensional TOCSY, consistent with a glycerol moiety (Pérez-Victoria et al., 2009). 1H-NMR signals with resonances at δ −0.32, 0.60, and 0.68 ppm strongly suggested a cis-cyclopropane unit (Knothe, 2006). NMR spectra from the higher migrating unknown were similar to those described above (Dataset 1, f–i) but were less complex because of reduced peak overlap, consistent with a symmetrical molecule.
The presence, but not the location and absolute stereochemistry, of the cyclopropyl group was established experimentally. Prior analyses of C17 fatty acids in the closely related species Escherichia coli suggest that cyclopropyl groups occur at the 9,10 position (Grogan and Cronan, 1997) and are enriched on the sn-2 fatty acyl unit (Hildebrand and Law, 1964), which is consistent with our data. Extending the low mass accuracy results of nano-ESI-CID-MS, TOF CID-MS yielded high mass accuracy ions that enabled determination of the atomic composition of the fragments (Fig. 2 c), solving the structures of the unknowns as 6-phosphatidyltrehalose (PT) and 6,6′-diphosphatidyltrehalose (diPT). To our knowledge, this is the first report of trehalose-containing phospholipids.
Next, we sought to determine if the weaker coeluting bands detected in other Salmonella serovars (Fig. 1 b) represented chemically identical molecules. Despite being available in lower quantity and purity, NMR patterns of TLC-purified material from S. Typhimurium and S. Enteritidis were consistent with the lipids from each of these three serovars being chemically identical (Fig. S2 a). Although the lower yields limited NMR analysis and introduced artifacts from water contamination, CID-MS allows specific targeting of ions matching the m/z of PT and diPT. CID-MS independently established all of the major components of these structures, including fragment ions matching m/z values for loss of hexose, loss of dihexose, loss of phosphatidylhexose, and mono- and diacylated phosphatidyl units (Fig. S2 b). Also, S. Typhimurium and S. Enteritidis lipids coeluted with S. Typhi lipids as the characteristic doublets, matching the retention of TLC-MS–proven PT and diPT (Fig. 1 b). The combined TLC, MS, and NMR analyses provide clear evidence for production of PT and diPT by all three serovars.
Synthesis of diPT
Trehalose is an abundant and common disaccharide in plants and certain bacteria but is rare or absent in mammalian cells. Therefore, trehalose-containing lipids are by definition foreign compounds for the mammalian immune system. M. tuberculosis produces a highly immunogenic trehalose-containing glycolipid, first known as cord factor and later elucidated as 6,6′-trehalose dimycolate (TDM; Adam et al., 1967). TDM is one of the most widely used adjuvants in vaccines and experimental medicine. TDM can be administered as complete Freund’s adjuvant, where TDM is a major, but not the only, immune stimulus (Geisel et al., 2005; Shenderov et al., 2013), as pure TDM, or the simplified structure 6,6′-trehalose dibehenate (Pimm et al., 1979; Holten-Andersen et al., 2004). A major pathway of macrophage activation and enhancement of adaptive immune responses by TDM is via the pattern recognition receptor macrophage inducible C-type lectin (Mincle [or Clec4E]; Ishikawa et al., 2009; Werninghaus et al., 2009). Mincle is a member of the C-type lectin superfamily expressed on macrophages and other myeloid cells.
These considerations provided a strong rationale to test for immune responses to these newly discovered trehalose-based compounds. However, even though MS and NMR spectroscopy suggested high purity of TLC-purified trehalose phospholipids, S. Typhi synthesizes LPS, which is bioactive at picomolar concentrations. As LPS or other minor contaminants could create false-positive results in cellular assays, we undertook the complete chemical synthesis of diPT. Based on prior reports describing naturally occurring Gram-negative bacterial fatty acids (Hildebrand and Law, 1964; Grogan and Cronan, 1997), we first synthesized 9R,10S cyclopropyl fatty acid. This compound, together with palmitic acid, was used to prepare the required diacylglycerol, which along with hexabenzyl trehalose was used to assemble diPT via phosphoramidite coupling and subsequent deprotection. The nine-step synthesis was performed with an overall yield of 69%, resulting in 15 mg of the final product. The final product was validated by HPLC-MS and NMR spectroscopy, which showed good correspondence of the chemical shifts between the synthetic and natural samples, thereby verifying the proposed chemical structure of diPT (Fig. S3 vs. Dataset 1, f–j).
Access to milligram quantities of a synthetic diPT standard allowed more quantitative assessments of diPT as a component of the S. Typhi lipidome (Dataset 1 k). Using synthetic diPT as an internal standard for HPLC-MS and applying the method of standard additions to the three brightest natural diPT ions, we estimate that diPT is 2.5% of S. Typhi total lipid extract. This is a conservative estimate, because more than six ions are seen, and it generally matches the estimate for diPT of 1.7% derived from TLC (Fig. 1 b). Both results indicate that diPT and PT are among the most abundant lipids in S. Typhi.
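The method of standard additions mentioned above can be illustrated with a minimal Python sketch. Known amounts of standard are added to aliquots of the sample, the signal is fit to a line, and the endogenous amount is the intercept divided by the slope. All numbers below are invented for illustration (chosen so that the toy result lands on 2.5%), the 100 µg aliquot size is hypothetical, and the authors' actual fitting procedure is not described in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def endogenous_amount(added, signal):
    """Standard additions: analyte already present equals intercept / slope."""
    slope, intercept = fit_line(added, signal)
    return intercept / slope


# Hypothetical data: µg of synthetic diPT spiked into equal aliquots of
# lipid extract, and the summed ion intensity measured for each aliquot.
added_dipt_ug = [0.0, 1.0, 2.0, 4.0]
measured_signal = [5.0e4, 7.0e4, 9.0e4, 1.3e5]

EXTRACT_PER_ALIQUOT_UG = 100.0   # hypothetical aliquot size

natural_dipt_ug = endogenous_amount(added_dipt_ug, measured_signal)
natural_dipt_percent = 100.0 * natural_dipt_ug / EXTRACT_PER_ALIQUOT_UG
```

Because the standard is measured in the same matrix as the analyte, this design compensates for matrix effects on ionization, which is one common reason to prefer it over an external calibration curve.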
S. Typhi trehalose phospholipids are potent Mincle ligands
The overall chemical resemblance between diPT and TDM is that both are symmetrical molecules with a trehalose core that is substituted with lipids attached at the two 6-positions (Fig. 3 b). The key difference is that diPT is substituted with phospholipids, whereas TDM is substituted with mycolic acids. To determine whether diPT is recognized by Mincle, we used a reporter cell line stably transduced with murine Mincle, its signaling partner, FcRγ, and an NF-κB–driven GFP reporter construct. Because hydrophobic glycolipids like TDM are integrated in membranes and do not act in solution, bioassays to mimic this interaction rely on first coating wells with lipid and then adding cells in aqueous media. Despite this technical limitation, dose responses are measurable (Fig. 3 c). The dose responses to TDM and natural diPT were similar and highly sensitive in absolute terms, with cells responding to 5 ng of coated lipid. Pure synthetic diPT was less potent but still recognized in low amounts, with responses seen at 20 ng of coated lipid. Although the potent stimulation of reporter cells by synthetic diPT rules out effects of LPS or other trace bacterial contaminants as mediating response to natural diPT, the reduced GFP signal at high doses of natural diPT could reflect LPS-mediated toxicity. Synthetic sulfoglycolipid (James et al., 2018), a glycolipid substituted at the 2 and 3 positions of trehalose, did not stimulate the Mincle reporter cell line at high doses. Subsequently, we compared murine and human Mincle reporter cell lines and again found highly sensitive responses to synthetic diPT, with activation present at nanogram levels. Human Mincle is marginally more sensitive to diPT than to TDM, while for murine Mincle, the opposite is true (Fig. 3 d). DiPT failed to activate a negative control cell line expressing FcRγ only. Thus, diPT efficiently stimulates Mincle.
Candidate gene approach to biosynthesis
To determine the mechanism of biosynthesis of trehalose phospholipids, we used a candidate gene approach. DiPT was previously unknown, and we could not identify substantially similar lipids in other bacteria that could point us toward candidate synthetic enzymes involved in phospholipid transfer onto trehalose. However, genes for biosynthesis of the two components of diPT, trehalose and PG, are well known, as are those enzymes that produce “substituted PGs” like phosphatidylinositol and CL. Considering PT and diPT as carbohydrate-substituted PG and mining the S. Typhi genome, we identified 12 candidate genes for biosynthesis. Among these 12 candidates are four biosynthetic enzymes that produce trehalose from either glucose 6-phosphate and UDP-glucose (OtsA and OtsB) or from α(1–4)-linked glucose polymers (TreY and TreZ). A fifth enzyme shares a domain with PapA3, which is responsible for lipid transfer onto mycobacterial trehalose phleate (EntF; Burbaud et al., 2016). In addition, we tested four enzymes involved in synthesis of phospholipids (PagP, PgpA, PgpB, and PgpC) and three putative CL synthases (ClsA, ClsB, and ClsC), which are thought to transfer phosphatidyl units onto PG.
ClsB is essential for trehalose phospholipids, but not CL
We tested 12 candidate diPT and PT biosynthetic genes using two sets of single gene knockouts in S. Typhimurium, which were generated independently as kanamycin or chloramphenicol selected mutants (Porwollik et al., 2014). Because diPT did not resolve well on normal-phase HPLC-MS, we developed a suitable reverse-phase HPLC method that could reliably detect both PT and diPT. An equivalent mass of total lipid, as determined by weighing dry lipid, from each single gene knockout (Fig. 4 a) was analyzed using phosphatidylethanolamine (PE) as a secondary loading control (Table S1). This screen provided a clear result that was reproducible in both sets of mutants and for both lipids, with essentially all or nothing effects for each gene studied. Of 12 candidates, 11 genes (aas, clsA, entF, otsA, otsB, pagP, pgpA, pgpB, pgpC, treY, and treZ) were nonessential for biosynthesis (Fig. 4 a). There was no trehalose in the culture media, yet trehalose phospholipids persisted in single gene knockouts in trehalose biosynthesis pathways. This outcome likely resulted from the existence of independent routes to trehalose production, which involve either glucose-6-phosphate (OtsA and OtsB) or maltose and maltodextrin (TreY, TreZ, TreT, and TreS) intermediates. Only the clsB single gene knockout showed loss of PT and diPT in both mutant sets, while PE levels were unchanged (Fig. 4, a and b; Table S1). That this loss of PT and diPT was observed in two independently generated single gene knockout sets reduces the likelihood that lipid loss was due to an unrelated second hit occurring elsewhere in the genome of clsB mutants. Even so, we genetically complemented a clsB knockout using an arabinose-inducible system. Only when treated with arabinose did the reconstituted clsB knockout (ΔclsB::clsB) produce PT and diPT, demonstrating that clsB is necessary and sufficient for their production (Fig. 4 c).
In the WT strain, arabinose treatment partially diminished diPT synthesis for unknown reasons, which might partly explain why the arabinose-treated, genetically complemented strains did not fully restore lipid production to WT levels. Taken together, these data point to ClsB as an essential enzyme for the synthesis of trehalose phospholipids.
The acronym clsB (previously known as ybhO or f413) stands for CL synthase B (Guo and Tropp, 2000), an assignment based on sequence homology rather than direct demonstration of this enzymatic role. Using HPLC-MS to detect signals (m/z 1,390) at the retention time of a CL standard, we observed no significant change in the intensity or shape of the biphasic peak assigned to CL in the clsB knockouts (Fig. 4 d and Table S1). We conclude that clsB is nonessential and, under the conditions tested, does not affect CL concentration. Although an adjunctive role in CL biosynthesis might be masked by parallel functions of ClsA and ClsC, the simplest interpretation is that ClsB is, despite its name, not a CL synthase. Instead of coupling phosphatidic acid to PG, ClsB could couple phosphatidic acid to trehalose. Consistent with this interpretation, previous work has also suggested the likely existence of an unknown substrate for ClsB (Tan et al., 2012; Li et al., 2016) and that ClsA or ClsC is sufficient for CL synthesis.
The clsB knockout strain represented a new tool to determine if diPT and PT, considered among all cell wall lipids, were essential to stimulate Mincle. First, we found that total lipid extract from WT S. Typhimurium stimulated Mincle (Fig. 4 e). Although lower Mincle reporter response was seen compared with using pure diPT, this outcome was expected based on the much lower concentration of diPT among all lipids, the plate-bound nature of the lipid presentation in this assay, and the possible existence of antagonists or toxic factors among total cell wall lipids. However, the total lipid extract from ΔclsB bacteria did not stimulate human or murine Mincle reporter lines (Fig. 4 e), suggesting that among all lipids in S. Typhimurium, the diPT pathway is the dominant or sole source of Mincle ligands.
Genetic and chemical phylogeny of clsB and diPT
After the initial identification of trehalose phospholipids in S. Typhi, S. Typhimurium, S. Enteritidis, and S. Paratyphi, we next asked how broadly distributed these compounds are among bacteria. Also, given that PT and diPT were discovered in enteric bacteria, the specific question arises as to whether humans are continuously exposed to trehalose phospholipids via intestinal microbiota or through infection with enteric pathogens. First, we used a basic local alignment search tool (BLAST)–reverse BLAST-based approach to ask whether a S. Typhi ClsB orthologue is present in genomes of closely and distantly related bacterial genera (Fig. 5, a and b). The four studied S. enterica enterica strains had clsB genes predicted to encode proteins that were 100% identical, and S. bongori encoded a 99% identical protein. The genomes of species belonging to β- and γ-proteobacterial Escherichia, Shigella, Pseudomonas, and Bordetella encoded ClsB proteins that ranged from 51% to 87% identity to S. Typhi ClsB protein. We found no identifiable ClsB orthologues among α- and ε-proteobacteria or among more distant groups such as Gram-positive bacteria and actinobacteria. In all bacterial species found to have a ClsB orthologue, a ClsA orthologue, which has an established role in CL synthesis, was also present. Furthermore, phylogenetic analysis showed that ClsA, ClsB, and ClsC form distinct monophyletic branches clearly separating the three enzymes (Fig. 5 b).
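The BLAST and reverse-BLAST logic described above amounts to a reciprocal best hit filter, sketched below in Python. The sketch does not run BLAST; it assumes the best hits have already been tabulated, and every genome name and protein identifier in it is hypothetical.

```python
def reciprocal_best_hits(forward_best_hit: dict, reverse_best_hit: dict, query_name: str) -> dict:
    """Keep a candidate only if searching it back against the reference returns the query.

    forward_best_hit: genome -> best-hit protein ID when the query is searched
                      against that genome (genomes with no hit are simply absent).
    reverse_best_hit: protein ID -> name of its best hit in the reference proteome.
    """
    return {
        genome: protein
        for genome, protein in forward_best_hit.items()
        if reverse_best_hit.get(protein) == query_name
    }


# Hypothetical, pre-tabulated search results.
forward = {
    "genome_A": "protA1",   # best hit for S. Typhi ClsB in genome A
    "genome_B": "protB7",   # best hit for S. Typhi ClsB in genome B
    # genome_C: no hit, so it does not appear
}
reverse = {
    "protA1": "ClsB",       # maps back to ClsB, so it is called an orthologue
    "protB7": "ClsA",       # maps back to a paralogue, so it is rejected
}

orthologues = reciprocal_best_hits(forward, reverse, "ClsB")
```

The reverse search matters here because ClsA, ClsB, and ClsC are homologous, so a one-way search alone could mistake a ClsA or ClsC sequence for a ClsB orthologue.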
We next determined the extent to which such phylogenetic comparisons using gene homology analysis correctly predict the diPT and PT chemotypes. Therefore, we used HPLC-MS to chemotype 16 broadly divergent species highlighted in the bioinformatic analysis (Fig. 5, c and d). Despite high (99%) similarity with the S. Typhi clsB gene, we did not detect diPT or PT in S. bongori or in other species in which we found a clsB orthologue, including Pseudomonas aeruginosa, Bordetella pertussis, and Shigella species. Whether this is due to functional differences of the enzyme itself or instead to differences in its species-specific expression or interaction with other enzymes in the pathway will require further detailed studies beyond the scope of this analysis. A rather unexpected finding, pointing toward unknown contribution of expression level or other interacting proteins, was the distribution of diPT and PT among the six E. coli strains tested. Despite identical culture conditions and identical coding regions of their clsB genes, we found that four of the six strains synthesized the compounds (clinical isolates CVI-7, CVI-19; laboratory strains DH10B, BW25113), while two did not (ATCC25922, DH5α). We conclude that having a clsB gene is required for the synthesis of PT and diPT, but as-yet-unknown additional requirements, like gene transcription or production of precursor molecules or the lack of suppressors, must be met. In addition, genes annotated as clsB, based on the full-length sequence, may not have equivalent functionality because of small changes in key residues at the catalytic site or other critical sites in the protein.
Our results pointed to diPT as a strain-specific phenotype present in certain gastrointestinal pathogens, but the unexpected detection of diPT in E. coli strains raised questions about its possible role in the intestinal microbiome. Although feces represents a complex mixture of bacteria, E. coli is a common and abundant species in human feces, and in vitro evidence for high trehalose phospholipid production by this and other enteric bacteria raises the question of whether this compound can be detected in stool. Therefore, we looked for trehalose phospholipids in mouse and human stool samples by HPLC-MS. Neither PT nor diPT was detected in any of 10 human fecal samples, nor was either detected in murine ileum, colon, or cecum contents. This result held true for standard HPLC-MS readouts and also when signals were amplified 40- to 100-fold to assess the diPT signal at the limit of detection by MS (Fig. 5 e). Because stool contains a multitude of bacteria and fungi as well as partly digested food and host-derived material, we considered possible false-negative results by cross-suppression from unrelated compounds. However, separation by HPLC before ionization greatly reduces cross-suppression. A spike-in positive control, an S. Typhi lipid extract containing an estimated 20 µg of PT and diPT, generated strong trehalose phospholipid signals (Fig. 5 e). Thus, trehalose phospholipids appear to be undetectable or absent from the normal intestinal microbiota.
Overall, these data indicate that production of diPT is a strain- or serovar-specific phenotype and varies among isolates of the same species. ClsB orthologue searches are not sufficient to predict the presence of diPT, highlighting the usefulness of lipid chemotyping assays. Further, these results reveal a general pattern that diPT is restricted to a subset of Gram-negative bacteria; it is found in pathogens, but not gastrointestinal flora of healthy donors. Considering its role as an agonist for Mincle, a widely expressed innate immune pattern recognition receptor in humans, diPT becomes a candidate for adjuvant development, control of virulence, and a new marker of clinical interest to physicians.
Discussion
These data identify ClsB as essential for production of PT and diPT in the major disease-causing serovars of S. enterica enterica. Although trehalose is common in plants and bacteria, trehalose phospholipids were previously unknown. Thus, the discovery and detection of diPT and PT was entirely unexpected and prompts the reconsideration of the function of the enzyme known as ClsB. ClsA condenses two PG molecules to form CL, whereas ClsC condenses PE and PG (Tan et al., 2012; Rossi et al., 2017). Despite its name and homology to ClsA and ClsC, the function of ClsB has been less clear. E. coli ClsB mediates PG turnover to CL outside of cells, but in other experiments, it does not affect CL content in E. coli (Guo and Tropp, 2000; Tan et al., 2012) or Shigella flexneri (Rossi et al., 2017). Thus, it was previously suggested that ClsB might actually use different, unknown substrates or produce lipids other than CL (Tan et al., 2012), a hypothesis that is directly supported by our data. Other members of the phospholipase D superfamily cleave phosphates to transfer phosphatidic acid to an acceptor, and here we show that S. Typhimurium clsB is absolutely required for PT and diPT biosynthesis, suggesting that trehalose could be an acceptor for phosphatidic acid transfer. While in vitro studies of recombinant ClsB are required to test this hypothesis, our findings cast further doubt on a gene function corresponding to its name as a CL synthase.
Here, we identified murine and human Mincle as cellular receptors for diPT. Members of the C-type lectin family of receptors, which also includes Dectin-1, Dectin-2, and DC-SIGN, can activate monocytes, macrophages, and myeloid dendritic cells via an immunoreceptor tyrosine-based activation motif (ITAM) or ITAM-like domains that act as kinases to turn on phospho-Syk signaling pathways (Del Fresno et al., 2018). The crystal structure of human Mincle identifies a calcium-binding site positioned near a proposed carbohydrate-binding site, which in turn is located adjacent to a hydrophobic patch (Feinberg et al., 2013; Furukawa et al., 2013). Mincle activation occurs in response to 1-linked and 6-linked glucose–containing lipids (Behler-Janbeck et al., 2016; Decout et al., 2017; Nagata et al., 2017). Comparative analysis with other C-type lectins, mutational analysis, and molecular modeling suggest that the glucose or trehalose sugars common to natural Mincle ligands bind in the carbohydrate-binding site (Feinberg et al., 2013; Furukawa et al., 2013; Söldner et al., 2018). The 6,6′-linked C22 fatty acids in trehalose dibehenate and C80 mycolic acids in TDM are proposed to bind in the hydrophobic patch. Thus, diPT contains the two key chemical elements previously proposed to mediate Mincle binding. However, because there is no precedent for Mincle activation by phospholipids, prior models have not accounted for putative interactions with an anionic phosphoglycerol unit. There is no identifiable cationic binding site between the carbohydrate-binding site and the hydrophobic patch on Mincle. However, docking simulations suggest that the most proximal part of 6-linked alkyl chains, corresponding to the two phosphoglycerol units in diPT, does not contact Mincle (Söldner et al., 2018). Thus, existing models predict diPT docking in a manner in which the phosphate moieties act as a bridge between the sugar- and lipid-binding epitopes, rather than directly contacting Mincle.
These structural considerations and the precedent of TDM provide a rationale for adjuvant development using synthetic modifications of the natural diPT structure. Freund’s adjuvant is widely used in experimental biology and provided the basis for use of pure TDM as an adjuvant. Subsequently, the chemically simplified and more hydrophilic adjuvant 6,6′-trehalose dibehenate was developed as a further refinement of the TDM natural structure (Pimm et al., 1979; Holten-Andersen et al., 2004). Unlike these molecules, diPT contains two phospholipid units, which further promote water solubility, a favorable feature in adjuvant development. This chemical consideration, along with the high potency of synthetic diPT for human Mincle agonism, supports further modification of the diPT chemical scaffold for adjuvant development. In particular, the chemical syntheses reported here for diPT can be modified to include simplified lipid moieties to further increase water solubility without affecting the Mincle-binding regions of the molecule.
Results from reporter lines rule in both human and mouse Mincle as receptors for diPT. Conversely, the loss of Mincle agonism in the clsB knockout strongly suggests other unrelated compounds in S. Typhimurium do not redundantly agonize Mincle. These data establish key receptor–ligand interactions, raising new questions about the broader cell biology of macrophage response. For example, whether PT and diPT activate other innate immune receptors is unknown, and the specific subtypes of myeloid cells or macrophages activated by this system are not yet understood. Although they are structurally related and nearly equipotent, the extent to which Gram-negative bacterial and mycobacterial agonists have equivalent function is not yet known. The study of these new natural and synthetic Mincle agonists can be used to test certain questions and controversies that have arisen through study of mycobacterial cord factor, TDM (Ishikawa et al., 2009). In addition to activating Mincle, cord factor has been reported to activate TLR4 and related MyD88-based signaling pathways (Geisel et al., 2005; Oda et al., 2014). The phosphate moieties of PT and diPT render these compounds as “phospholipid” variants of cord factor. This and the mono- versus divalent nature of PT and diPT might influence the final outcomes of cellular activation.
TDM synthesis in M. tuberculosis takes place by transfer of mycolic acids to the 6 and 6′ positions of trehalose by the enzymes Ag85A, Ag85B, and Ag85C (Belisle et al., 1997; Backus et al., 2014). ClsB is not related to Ag85 proteins but is instead a member of the phospholipase D family of proteins. Therefore, the TDM and diPT-generating enzymes and the genes that encode these enzymes are unrelated, but their products are similar with regard to structure and function. Thus, the separate evolution of differing enzymes that lead to molecules with highly similar structure that trigger the same receptor might be a previously unrecognized example of evolved functional convergence of Gram-negative bacteria and mycobacteria.
Whereas prior work on Mincle has emphasized mycobacterial and fungal ligands, the data presented here generate a strong link with Gram-negative enteric bacteria. We demonstrate that five serovars or strains of common Gram-negative bacteria produce Mincle ligands. Further, although we do not yet know if diPT has other functions, other lipids in S. Typhimurium cannot substitute for ClsB-dependent lipids for Mincle activation. The emerging serovar- and species-specific patterns of diPT production suggest that diPT is likely restricted to Gram-negative enteric bacteria. We failed to detect diPT in the enteric flora of healthy humans and mice but instead detected the compound in some E. coli strains and serovars of S. enterica enterica, with highest expression in S. Typhi, which is the cause of typhoid fever. These correlations between virulence and diPT expression support directed studies of diPT as a virulence factor. Given the differential high, low, or absent levels of diPT within strains or serovars of the same species, as well as the low predictive value of ClsB gene identity for diPT, lipid chemotyping of clinical samples will be needed to understand the role of diPT in infection. The high production of diPT by S. Typhi, its potent agonism of a major activating receptor on macrophages, and the presence of diPT among bacteria that cause enteric fever syndromes now raise a key question for future studies: Does diPT contribute to fever and sepsis that define enteric fever syndromes?
The World Health Organization estimates that diarrheal diseases, most commonly caused by enteric Gram-negative pathogens, remain the ninth leading cause of death worldwide, the fourth leading cause of death in developing nations, and a major cause of death among children (World Health Organization, 2018). Accordingly, S. Typhi and related serovars are priority pathogens that have spurred development of diagnostics, drug treatments, antisepsis regimens, and vaccines (Andrews et al., 2019). Both clinical and basic research on Gram-negative bacterial lipid endotoxins focuses on LPS–TLR4 interactions, which constitute one of the most extensively studied and widely recognized receptor–ligand pairs in immunology (Nature Reviews Immunology, 2011). Given this high investment in LPS and extensive attention to disease-causing serovars, it is striking that two immunogenic lipids that are abundant in the most virulent serovar could go undiscovered in decades of research on Salmonella species. Here, we describe diPT as an immunogen hiding in plain sight in bacterial membranes. This example underscores that the membrane content of Gram-negative bacterial cell walls, particularly the complex glycolipids in the outer membrane interfacing with the human host, remains an understudied resource for immunogens and virulence factors. The lipidomic profiles reported here identify hundreds of molecular species that differ by serovar, pointing to specific future paths for detecting serovar-specific lipid markers, including other molecules that could control host response.
Materials and methods
Bacterial cultures and total lipid extraction
In addition to the bacterial strains listed in Table S2, we used S. Typhimurium 14028s strains in which single genes are replaced by a cassette containing a kanamycin resistance gene oriented in the sense direction or a chloramphenicol resistance gene oriented in the antisense direction (Porwollik et al., 2014). A single colony was picked from a plate, transferred to a 3-ml starter culture, and incubated overnight at 37°C while shaking. 1 ml of a starter culture was added to 500 ml of medium and incubated overnight at 37°C while shaking. Bacteria were centrifuged for 15 min at 3,500 rotations per minute and washed twice with PBS. Lipids were extracted by rocking the pellet in organic solvent for 1 h at 20°C, centrifugation for 10 min at 3,500 rotations per minute, and collection of the supernatant. Solvents used for extraction were HPLC grade 2:1 chloroform:methanol (C:M; Merck), 1:1 C:M, and 1:2 C:M. The three supernatants were pooled and dried and lipids were dissolved and stored in 1:1 C:M. Murine colon, cecum, or small intestinal content (0.5–1.0 g) or fresh human stool (1.0–2.0 g) was suspended in 10:1 CH3OH:0.3% NaCl in water and subjected to a series of extractions against petroleum ether, 9:10:3 CHCl3:CH3OH:0.3% NaCl in water, and 5:10:4 CHCl3:CH3OH:0.3% NaCl in water. Extracts were combined, dried, and weighed. Spiking was performed directly after initial suspension in 10:1 methanol:0.3% NaCl in water.
Comparative lipidomics
Total lipid extracts of triplicate cultures of S. Typhi (Quailes) and S. Paratyphi A (NVGH308) were separated using GL-Sciences Inertsil Diol 3 µm 2.1 × 150 mm normal-phase HPLC column equipped with Varian Monochrom 3 µm × 4.6 mm Diol guard column. Lipids were measured on an Agilent Quadrupole Time-of-Flight Accurate-Mass QTOF LC/MS G6520B instrument in positive mode as described previously (Layre et al., 2011). Data were analyzed using Mass Hunter (Agilent), LIMMA (Ritchie et al., 2015), and XCMS (Smith et al., 2006).
TLC
DURASIL-25 TLC plates (Macherey-Nagel) were precleared with C:M:H2O (60:30:6 [vol:vol:vol]) and dried. Bacterial total lipid extract (300 µg) or purified standard (40 µg) was applied and resolved with C:M:H2O (60:30:6 [vol:vol:vol]). The plates were dried and either stained for analytical purposes (Fig. 1 b) or used for specific lipid isolation. Staining was performed by spraying 3% copper acetate monohydrate (Sigma-Aldrich) in 8% phosphoric acid (Merck) on the plate and baking at 140°C. Isolation of specific lipids was performed after spraying the plate with water, which makes lipid bands temporarily visible. Lipids of interest were marked, the plate was dried, and the silica layer containing the lipid was scraped off the glass plate. The lipid was isolated from the silica by rocking the silica for 1 h in 1:1 C:M. After rocking, the sample was centrifuged and the supernatant containing the lipid was stored. PE (850758P) and PG (840503P) standards were from Avanti Lipids.
Analytical MS
For mass determination and higher-order CID-MS, purified lipids were dissolved in methanol and measured by nano-ESI-MS in the positive mode (LXQ Linear Ion Trap Mass Spectrometer with MSn software; Thermo Fisher Scientific). For optimal detection and quantification of PT, diPT, pPE, and CL with high mass resolution, reverse-phase HPLC-MS was performed on an accurate-mass QTOF LC/MS G6520 instrument. In 15 min, a gradient from 100% solvent A (5% H2O in MeOH with 2 mM ammonium formate) to 100% solvent B (10% cyclohexane in 1-propanol with 3 mM ammonium formate) was run on an Agilent Poroshell 120Å, EC-C18, 1.9 µm column equipped with an Agilent EC-C18, 3.0 × 5 mm, 2.7 µm guard column. The elution time of CL was determined using a synthetic standard (750332; Avanti). For the quantification of natural diPT, the standard addition analytical method was used. S. Typhi total lipids (1 mg/ml) were spiked with a series of known concentrations of synthetic diPT (0, 5, 10, and 15 µg/ml) and subjected to reverse-phase HPLC-MS analysis. The peak areas of extracted ion chromatograms of m/z 1,626.032 (fatty acid chain length and unsaturation: C66:2) were plotted against the concentrations of the spiked synthetic standard. The extrapolated number on the x axis is the estimated concentration of natural diPT (m/z 1,626.032) in the S. Typhi total lipid extract. The peak areas of two other fatty acid variants (m/z 1,654.064 C68:2 and m/z 1,648.016 C68:5) were also extracted and compared with the peak area of m/z 1,626.032 to determine their concentrations. The estimated natural diPT percentage was obtained by dividing the combined concentrations of the three major chain length analogues by the total injected lipid concentration (1 mg/ml).
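The standard addition calculation described above can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the peak-area values in the usage note are hypothetical, and an ordinary least-squares fit is assumed for the extrapolation to the x axis.

```python
def standard_addition(spiked_conc, peak_areas):
    """Estimate the endogenous analyte concentration by standard addition.

    Fits a least-squares line through (spiked concentration, peak area)
    and returns the magnitude of the x-intercept, intercept / slope.
    Assumes a linear detector response over the spiked range.
    """
    n = len(spiked_conc)
    mean_x = sum(spiked_conc) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in spiked_conc)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(spiked_conc, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept / slope


def scale_by_peak_area(ref_conc, ref_area, variant_area):
    """Concentration of another chain length variant, scaled from the
    reference ion by the ratio of peak areas (assumes equal response)."""
    return ref_conc * variant_area / ref_area


def percent_of_total(analogue_concs, total_conc):
    """Combined analogue concentration as a percentage of total lipid.
    Both arguments must use the same unit, for example µg/ml."""
    return 100.0 * sum(analogue_concs) / total_conc
```

As a usage example with invented numbers, spikes of 0, 5, 10, and 15 µg/ml giving peak areas of 300, 800, 1,300, and 1,800 correspond to a slope of 100 and an intercept of 300, so `standard_addition` returns 3.0 µg/ml. With the total lipid at 1 mg/ml (1,000 µg/ml), `percent_of_total([3.0], 1000.0)` then gives 0.3%.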
NMR
NMR spectra were recorded on an Oxford 600-MHz magnet (600 MHz for 1H, 151 MHz for 13C) Bruker AVANCE II Console system equipped with a 5-mm Prodigy HCN TXI cold probe and an Agilent 700-MHz magnet (700 MHz for 1H, 176 MHz for 13C) with a DDR2 console equipped with an Agilent triple-resonance helium cold probe. The 31P-decoupling experiments were performed on a Varian 400-MHz magnet equipped with a 5-mm AutoX OneProbe. Chemical shifts are reported in ppm and coupling constants (J) in Hz. The samples were measured in MeOH-d4 of which the residual solvent resonance was used as an internal standard (δ 3.31 for 1H, δ 49.00 for 13C). The 1H-NMR spectra were assigned using COSY, TOCSY, multiplicity-edited HSQC (135DEPT-HSQC) and heteronuclear multiple bond correlation. The TOCSY experiments were performed with a MLEV17 mixing scheme with a 100 ms spin lock (Gheysen et al., 2008). The data are reported as follows: chemical shifts (δ), multiplicity (d, doublet; dd, double doublet; ddd, double double doublet; dt, double triplet; m, multiplet; q, quartet; s, singlet; td, triple doublet; t, triplet), coupling constants J (Hz), and integration (Dataset 1).
Reconstitution of ClsB in S. Typhimurium ΔclsB mutant
ClsB was isolated via PCR using forward primer 5′-CGCGGATCCGATACGGTAACGCGGTTCTTTCT-3′ and reverse primer 5′-GATGAATTCCGGCCGCAATAAAGCCGTCCAAG-3′. The sequence of the PCR product was verified by Sanger sequencing. A second PCR was performed using forward primer 5′-TAGAGGAATAATAAATGATGAAATGCGGCTGGCGTGAAGGTAATCAA-3′ and reverse primer 5′-GTTAGGGCTTCACTCCTGTATTTTC-3′ to make it suitable for cloning into the pBAD-TOPO vector (Life Technologies). Correct insertion was confirmed by restriction digestion with BsmI (New England Biolabs). A 2-ml S. Typhimurium ΔclsB culture was grown overnight in Luria-Bertani broth (LB) with chloramphenicol (25 µg/ml). 0.5 ml of the starter culture was diluted in 50 ml LB with chloramphenicol. This culture was incubated for 2.5 h at 37°C while shaking. The culture was washed three times with ice-cold 10% glycerol and used for electroporation by adding 50 ng pBAD-TOPO-CLSB to 40 µl bacteria. A 0.1-cm cuvette was pulsed with 1.6 kV, capacitance of 25 µF, for ∼5 µs. Cells were grown for 4 h in 1 ml LB without antibiotics at 37°C before plating on LB plates with 100 µg/ml ampicillin. After overnight incubation, colonies were picked and grown overnight in 3 ml LB with chloramphenicol and ampicillin. Insertion of the clsB gene was confirmed by PCR. Total lipid extraction was performed on cultures of a reconstituted clone grown with or without induction with 0.2% arabinose.
Mincle activation assay
Lipids were diluted in 20 µl isopropanol per well of a flat-bottom 96-well plate. Isopropanol was evaporated, after which 3 × 10⁴ reporter cells in 100 µl medium were added per well. Reporter cell lines were FcRγ-, murine Mincle and FcRγ–, or human Mincle and FcRγ–expressing 2B4-NFAT-GFP cells (Yamasaki et al., 2008). After 18 h at 37°C, cells were washed with PBS 1% BSA, and GFP expression was measured on the FACSCanto II (BD). Data were analyzed using FlowJo software.
Identification of ClsB in other bacterial species
For a selection of bacterial species, the 16S sequence was downloaded from the SILVA database (https://www.arb-silva.de). The Alignment, Classification, and Tree service from the SILVA server was used with default settings to align the sequences and generate a phylogenetic tree. The unrooted tree was visualized using the ITOL server (http://itol.embl.de/). Protein sequences of ClsA, ClsB, and ClsC of S. Typhi were used for a BLASTP search in bacterial genomes using the National Center for Biotechnology Information server. For each Cls protein, the percentage identity to the top hit in each bacterial species was determined. A BLASTP search of this top hit in the S. Typhi genome was performed to determine whether the closest hit was the same Cls protein used in the initial BLASTP search. For the phylogenetic analysis of Cls proteins, the amino acid sequences were aligned using MUSCLE (Edgar, 2004) and informative sites selected with trimAl (Capella-Gutiérrez et al., 2009), and the tree (as shown in Fig. 5 b) was calculated with MrBayes (Ronquist et al., 2012) for 1 million generations under the mixed amino acid model and a 25% burn-in for consensus tree generation; a separate calculation using IQ-TREE (Nguyen et al., 2015) under the automatic model selection resulted in the same topology. Trees and input alignments are available at https://figshare.com/s/daf0e062de746281ed52.
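The two-step BLASTP procedure above amounts to a reciprocal best-hit check, which can be sketched as below. This is an illustrative outline under stated assumptions, not the workflow used in the study: hits are represented as hypothetical tuples of (subject identifier, percent identity, bit score), the top hit is assumed to be the one with the highest bit score, and all identifiers are invented.

```python
def best_hit(hits):
    """Return the hit with the highest bit score, or None if there are none.

    Each hit is a tuple (subject_id, percent_identity, bit_score).
    Ranking by bit score is an assumption of this sketch.
    """
    return max(hits, key=lambda h: h[2]) if hits else None


def reciprocal_best_hit(query_name, forward_hits, reverse_hits_by_subject):
    """Check whether the top hit of a query also points back to the query.

    forward_hits: hits of the S. Typhi Cls protein in one target species.
    reverse_hits_by_subject: maps each target protein identifier to its
        hits in the S. Typhi proteome.
    Returns (top_subject_id, percent_identity, is_reciprocal).
    """
    top = best_hit(forward_hits)
    if top is None:
        return (None, None, False)
    back = best_hit(reverse_hits_by_subject.get(top[0], []))
    is_reciprocal = back is not None and back[0] == query_name
    return (top[0], top[1], is_reciprocal)
```

For example, if the forward search with ClsB returns a hypothetical protein `target_p1` as its top hit, and the reverse search with `target_p1` returns ClsB as its top hit in S. Typhi, the pair is reported as reciprocal. If the reverse search instead returns ClsA, the top hit is flagged as not being the same Cls protein, whatever its percentage identity.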
Synthesis of diPT
We developed a stereoselective chemical synthesis of diPT, including de novo synthesis of a fatty acid with a 9R,10S cis-cyclopropyl ring. The cyclopropyl fatty acid was prepared as described by Shah et al. (2014). Briefly, rhodium-catalyzed cyclopropenation of octyne with ethyl diazoacetate was followed by resolution of the enantiomers via diastereomer formation and chromatography (Liao et al., 2004). The resulting enantiopure cyclopropenes were then converted to the desired enantiomeric fatty acids over several synthetic steps including a Wittig reaction (Coxon et al., 1999) and a diimide reduction (Smit et al., 2008). Diacylglycerol was prepared by epoxide ring opening (Jacobsen et al., 1997) of protected S-glycidol with palmitic acid, followed by esterification of the resulting hydroxyl group with the 9R,10S cyclopropyl fatty acid. Careful deprotection, to avoid acyl shift, produced the free diacylglycerol (Fodran and Minnaard, 2013). Suitably protected trehalose (Gilbertson and Chang, 1995) was converted to bis-2-cyanoethyl N,N-diisopropylchlorophosphoramidite and coupled with diacylglycerol, mediated by dicyanoimidazole, and then immediately followed by oxidation to the phosphate. Finally, deprotection of the phosphates and removal of the benzyl protecting groups provided the desired product.
Online supplemental material
Fig. S1 shows the identification of TLC-isolated lipids from S. Typhi as hexose phospholipids. Fig. S2 shows a comparison of PT and diPT from S. Typhi, S. Enteritidis, and S. Typhimurium. Fig. S3 shows NMR of synthetic diPT. Table S1 lists the abundance of PT, diPT, PE, and CL in S. Typhimurium single gene knockouts. Table S2 lists bacterial strains and culture media. Dataset 1 shows the identification of phosphatidyltrehalose and diphosphatidyltrehalose from S. Typhi by NMR and quantification.
Acknowledgments
We thank B. Kuipers and E. Lambert for assistance with bacterial culture, J. Kemmink for NMR spectroscopy studies, and R. Cotton for critical reading of the manuscript and advice.
This work was supported by the Nederlands Wetenschappelijk Onderzoek (grant 824.02.002) and the National Institutes of Health (grant AI116604 to D.B. Moody). P. Reinink was supported by the European Molecular Biology Organization short-term fellowship (16-2015). A.J. Minnaard and V.K. Mishra are supported by the University of Groningen. S. Yamasaki was supported by the Ministry of Education, Culture, Sports, Science and Technology (grants 26293099 and 26110009) and the Japan Agency for Medical Research and Development (grants JP17gm0910010 and JP17ak0101070). V. Cerundolo, G. Napolitani, and D.B. Moody were supported by the UK Medical Research Council (grant MR/K021222/1). V. Cerundolo was also supported by Cancer Research UK (grant C399/A2291) and by the National Institute for Health Research Oxford Biomedical Research Centre. S. Porwollik and M. McClelland were supported by the US Department of Agriculture (grants 2017-67017-26180 and 2017-67015-26085) and the National Institutes of Health (R01AI136520 and contract HHSN272200900040C).
The authors declare no competing financial interests.
Author contributions: I. Van Rhijn, A.J. Minnaard, D.B. Moody, S. Yamasaki, V. Cerundolo, and M. McClelland conceived and designed the experiments. V.K. Mishra carried out chemical synthesis. P. Reinink performed bacteriology experiments and genomic searches. P. Reinink and E. Ishikawa performed immunology experiments. J.A. Mayfield performed statistical analyses. J. Buter, T-Y. Cheng, V.K. Mishra, A.J. Minnaard, and P. Reinink performed chemical analyses. S. Yamasaki, M. McClelland, P.T.J. Willemsen, E. Heinz, G. Dougan, P.J. Brennan, V. Cerundolo, G. Napolitani, S. Porwollik, and C.A. van Els contributed reagents or original ideas. I. Van Rhijn and D.B. Moody wrote the paper.
References
Author notes
P. Reinink and J. Buter contributed equally to this paper.